PATENT AGENT 

ALLEN C. YUN, Ph.D. 



RE: New Divisional Patent Application in U.S. 
Applicant(s): Takanori OKURA et al. 

Title: GENOMIC DNA ENCODING A POLYPEPTIDE CAPABLE OF INDUCING 
THE PRODUCTION OF INTERFERON-γ 
Atty's Docket: OKURA-1A 

Sir: 

Attached herewith is the above-identified application for Letters Patent 
including: 

[X] Specification (29 pages), claims (4 pages) and abstract (1 page) 
[X] 1 Sheet Drawings (Figure 1) 
[X] Formal [ ] Informal 
[X] Declaration and Power of Attorney ( pages) 
[ ] Newly executed [X] Copy from prior application no. 08/884,324 
[X] Preliminary Amendment 
[ ] Computer-readable Sequence Listing 
[ ] Supplemental Preliminary Amendment 
[ ] Information Disclosure Statement with ( ) references 
[ ] A verified statement to establish small entity status under 37 CFR 
§1.9 and 37 CFR §1.27 ( page(s)) 
[X] A check in the amount of $760.00 (check no. 24556) to cover: 
[X] The filing fee calculated as follows (including any preliminary 
amendment for entry prior to calculation of the filing fee): 



CLAIMS AS FILED 

FOR                        NUMBER FILED   NUMBER EXTRA   RATE     BASIC FEE 
                                                                  $ 760.00 
TOTAL CLAIMS               17 - 20        0              x 18 
INDEPENDENT CLAIMS         3 - 3          0              x 78 
[ ] Multiple Dependent 
    Claim Presented                                      x 260 
[ ] Reduction of 1/2 for small entity 
TOTAL FILING FEE                                                  $ 760.00 
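
The arithmetic behind the fee table can be checked with a short sketch. The function name and its parameters are illustrative assumptions; the rates ($760 basic fee, $18 per total claim over 20, $78 per independent claim over 3, $260 for a multiple dependent claim, and a one-half reduction for a small entity) are simply the ones printed in the table.

```python
def filing_fee(total_claims, independent_claims,
               multiple_dependent=False, small_entity=False):
    """Hypothetical helper reproducing the fee table's arithmetic."""
    basic_fee = 760.00
    # Only claims beyond the included allowance are charged.
    extra_total = max(total_claims - 20, 0)
    extra_independent = max(independent_claims - 3, 0)
    fee = basic_fee + extra_total * 18.00 + extra_independent * 78.00
    if multiple_dependent:
        fee += 260.00
    if small_entity:
        fee /= 2  # one-half reduction for small entity status
    return fee


# Claims as filed: 17 total, 3 independent, no multiple dependent claim.
fee_as_filed = filing_fee(17, 3)
```

With 17 total claims and 3 independent claims there are no extra claims, so only the basic fee applies, which agrees with the total shown in the table.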



In re of 



[ ] Any additional fee required by the filing of an enclosed 

preliminary or supplemental preliminary amendment (for entry after 
calculation of the filing fee) has been calculated as shown below: 





           CLAIMS        HIGHEST NO. 
           REMAINING     PREVIOUSLY    PRESENT 
           AFTER         PAID FOR      EXTRA      RATE         CALCULATION 
           AMENDMENT 
TOTAL                                             x $18.00     $ 
INDEP                                             x $78.00     $ 
[ ] Multiple Dependent Claim Presented            x $260.00    $ 
Total of Above Calculations =                                  $ 
Reduction by 1/2 for filing by small entity                    $ 
Total Additional Fee =                                         $ 



[ ] Other Fees: . 

[ ] Other Attachments: . 

[X] Return Receipt Postcard (in duplicate) 

The following statements are applicable: 

[X] The benefit under 35 U.S.C. §119 is claimed of the filing date of: 
Application No. 185305/1996 in Japan on 27 June 1996. A certified 
copy of said priority document [ ] is attached [X] was filed in 
progenitor case 08/884,324 on October 6, 1997. 

[X] The present application is a [ ] Continuation [X] Division 

[ ] Continuation-in-part of prior application No. 08/884,324. 

[X] Incorporation By Reference . The entire disclosure of the prior 

application, from which a copy of the oath or declaration is supplied 
herewith, is considered as being part of the disclosure of the 
accompanying application and is hereby incorporated by reference therein. 

[ ] A signed statement deleting inventor (s) named in the prior application is 
attached. 

[X] The prior application was assigned to: KABUSHIKI KAISHA HAYASHIBARA 

SEIBUTSU KAGAKU KENKYUJO, 2-3, 1-chome, Shimoishii, Okayama-shi, Okayama, 
Japan. 

[ ] Amend the specification by inserting before the first line the sentence: 

--This is a continuation division of copending parent application 

Serial No. , filed .-- 

[X] Certain documents were previously cited or submitted to the Patent and 

Trademark Office in the following prior application 08/884,324, which is 
relied upon under 35 U.S.C. §120. Applicants identify these documents by 
attaching hereto a form PTO-1449 listing these documents, and request 
that they be considered and made of record in accordance with 37 CFR 
§1.98(d). Per Section 1.98(d), copies of these documents need not be 
filed in this application. 

[ ] A verified statement claiming small entity status is enclosed in 

progenitor application no. , filed . Status is 

still proper and desired. 






[X] The paper copy of the Sequence Listing in this application is identical to 
the computer-readable copy of the Sequence Listing filed June 27, 1997, in 
application no. 08/884,324. In accordance with 37 CFR §1.821(e), please 
use the last-filed computer readable form filed in that application as the 
computer readable form for the instant application. It is understood that 
the Patent and Trademark Office will make the necessary change in 
application number and filing date for the instant application. A paper 
copy of the Sequence Listing is included in the originally-filed 
specification of the instant application (or included in a separately filed 
preliminary amendment for incorporation into the specification). 

[ ] The undersigned attorney of record hereby revokes the powers of attorney 
of: 

[ ] The undersigned attorney of record hereby appoints associate power of 

attorney, to prosecute this application and to transact all business in the 
Patent and Trademark Office in connection therewith to: 

[X] The Commissioner is hereby authorized to charge payment of the following 
additional fees associated with this communication or credit any 
overpayments to Deposit Account No. 02-4035: 
[X] Any additional filing fees required under 37 CFR §1.16. 
[X] Any patent application processing fees under 37 CFR §1.17. 

[X] The Commissioner is hereby authorized to charge payment of the following 
fees, based on any paper filed during the pendency of this application or 
any CPA thereof, to effect any amendment, petition, or other action 
requested in said paper or credit any overpayments to Deposit Account No. 
02-4035: 

[X] Any patent application processing fees under 37 CFR §1.17. 

[ ] The issue fee set in 37 CFR §1.18 at or before mailing the Notice of 
Allowance, pursuant to 37 CFR §1.311(b). 
[X] Any filing fees under 37 CFR §1.16 for presentation of extra claims. 
[X] If a paper is untimely filed in this or any CPA thereof by 
Applicant(s), the Commissioner is hereby petitioned under 37 CFR 
§1.136(a) for the minimum extension of time required to make said 
paper timely. In the event a petition for extension of time is made 
under the provisions of this paragraph, the Commissioner is hereby 
requested to charge any fee required under 37 CFR §1.17 to Deposit 
Account 02-4035. 



[X] The Commissioner is hereby authorized to credit any overpayment of fees 
accompanying this paper to Deposit Account No. 02-4035. 




Allen C. Yun 
Registration No. 37,971 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

ATTY.'S DOCKET: OKURA-1A 



In re Application of: 

Takanori OKURA et al. 

Serial No.: NOT YET ASSIGNED 
(Divisional of 08/884,324) 

Filed: ON EVEN DATE HEREWITH 

For: GENOMIC DNA ENCODING A 

POLYPEPTIDE CAPABLE OF... 



Art Unit: 
Examiner : 
Washington, D.C. 

January 10, 2000 



PRELIMINARY AMENDMENT 

Honorable Commissioner of Patents and Trademarks 
Washington, D.C. 20231 

Sir: 

Contemporaneous with the filing of this case and prior 
to calculation of a filing fee and examination on the merits, 
kindly amend as follows: 



IN THE SPECIFICATION 

Page 1, after the title and before "Background of the 
Invention", insert --CROSS-REFERENCE TO RELATED APPLICATIONS 

This is a divisional of copending parent application 
serial no. 08/884,324, filed June 27, 1997.-- 

Page 12, line 24, after "ggc-3'", insert 
--(SEQ ID NO:16)--; and 
line 26, after "tgc-3'", insert 
--(SEQ ID NO:17)--. 

Page 13, line 13, after "ggt-3'", insert 
--(SEQ ID NO:18)--; and 
line 15, after "tgc-3'", insert 
--(SEQ ID NO:19)--. 

Page 14, line 16, after "tcc-3'", insert 
--(SEQ ID NO:20)--; and 
line 25, after "cac-3'", insert 
--(SEQ ID NO:21)--. 

Page 15, line 14, after "cgg-3'", insert 
--(SEQ ID NO:22)--; and 
line 18, after "ttg-3'", insert 
--(SEQ ID NO:23)--. 

Page 16, line 12, after "tgc-3'", insert 
--(SEQ ID NO:24)--; and 
line 16, after "-3'", insert 
--(SEQ ID NO:25)--. 

Page 17, line 4, after "atc-3'", insert 
--(SEQ ID NO:26)--; 
line 8, after "ttg-3'", insert 
--(SEQ ID NO:27)--; 
line 22, after "ctc-3'", insert 
--(SEQ ID NO:28)--; and 
line 26, after "ttg-3'", insert 
--(SEQ ID NO:29)--. 

Page 18, line 11, after "tcc-3'", insert 
--(SEQ ID NO:30)--; and 
line 20, after "tac-3'", insert 
--(SEQ ID NO:31)--. 

Page 19, line 11, change "eukalyotic" to read 
--eukaryotic--; 
line 25, delete "Patent Kokai No. 193,098/96", 
and insert therefor --patent application--. 

Page 20, line 15, after "gta-3'", insert 
--(SEQ ID NO:32)--; and 
line 18, after "ttg-3'", insert 
--(SEQ ID NO:33)--. 

Page 21, line 5, after "-3'", insert 
--(SEQ ID NO:34)--; and 
line 8, after "atc-3'", insert 
--(SEQ ID NO:35)--. 

Page 22, line 19, change "abut" to read --about--. 
Page 26, line 20, delete "or without"; and 

line 21, delete "or 50 units/ml recombinant 
human interleukin 2". 

Page 27, lines 17-18 from the bottom, delete "The IFN-γ 
production is enhanced in combination with concanavalin A or 
interleukin 2 as a cofactor." 



REMARKS 

The amendments to the specification are made to provide 
consistency with the specification as amended in the parent 
application . 




Favorable consideration is respectfully solicited. 



Respectfully submitted, 

BROWDY AND NEIMARK, P.L.L.C. 
Attorneys for Applicant(s) 



By: 




Registration No. 37,971 



ACY : pr 

624 Ninth Street, N.W. 
Suite 300 

Washington, D.C. 20001 
Facsimile: (202) 737-3528 
Telephone: (202) 628-5197 



Genomic DNA encoding a polypeptide capable of inducing the 
production of interferon-γ 

Background of the Invention 

Field of the Invention 

The present invention relates to a genomic DNA, more 
particularly, a genomic DNA encoding a polypeptide capable of 
inducing the production of interferon-γ (hereinafter abbreviated 
as "IFN-γ") by immunocompetent cells. 

Description of the Prior Art 

The present inventors successfully isolated a 
polypeptide capable of inducing the production of IFN-γ by 
immunocompetent cells and cloned a cDNA encoding the 
polypeptide, which is disclosed in Japanese Patent Kokai 
Nos. 27,189/96 and 193,098/96. Because the present polypeptide 
possesses the properties of enhancing killer cells' cytotoxicity 
and inducing killer cells' formation as well as inducing IFN-γ, 
a useful biologically active protein, it is expected to be 
widely used as an agent for viral diseases, microbial diseases, 
tumors and/or immunopathies, etc. 

It is said that a polypeptide generated by gene 
expression may be partially cleaved and/or glycosylated through 
processing by intracellular enzymes in human cells. A 
polypeptide to be used in therapeutic agents should 
preferably be processed as it is in human cells, whereas human 
cell lines generally have the disadvantage of producing less of 
the present polypeptide, as described in Japanese Patent 
Application No. 269,105/96. Therefore, recombinant DNA 
techniques should be applied to obtain the present polypeptide 
in a desired amount. To produce the polypeptide processed as in 
human cells using recombinant DNA techniques, mammalian cells 
should be used as the hosts. 

Summary of the Invention 

In view of the foregoing, the first object of the present 
invention is to provide a DNA which efficiently expresses the 
polypeptide when introduced into a mammalian host cell. 

The second object of the present invention is to 
provide a transformant into which the DNA is introduced. 

The third object of the present invention is to 
provide a process for preparing a polypeptide, using the 
transformant. 

[Means to Attain the Object] 

Through energetic studies to attain the above objects, the 
present inventors found that a genomic DNA encoding the present 
polypeptide efficiently expresses the polypeptide when 
introduced into mammalian host cells. They also found that the 
polypeptide thus obtained possessed significantly higher 
biological activities than that obtained by expressing a cDNA 
encoding the polypeptide in Escherichia coli. 

The first object of the present invention is attained 
by a genomic DNA encoding a polypeptide with the amino acid 
sequence of SEQ ID NO:1 (where the symbol "Xaa" means 
"isoleucine" or "threonine") or a sequence homologous thereto, 
which induces interferon-γ production by immunocompetent cells. 

The second object of the present invention is attained 
by a transformant formed by introducing the genomic DNA into a 
mammalian host cell. 

The third object of the present invention is attained 
by a process for preparing a polypeptide, which comprises (a) 
culturing the transformant in a nutrient medium, and (b) 
collecting the polypeptide from the resultant culture. 

Brief Explanation of the Accompanying Drawings 

FIG. 1 is a restriction map of a recombinant DNA 
containing a genomic DNA according to the present invention. 

The symbols are explained as follows: The symbol 
"Hin dIII" indicates a cleavage site by the restriction enzyme 
Hin dIII, and the symbol "HuIGIF" indicates a genomic DNA 
according to the present invention. 

Detailed Description of the Invention 

The following are the preferred embodiments according 
to the present invention. This invention is based on the 
identification of a genomic DNA encoding the polypeptide with 
the amino acid sequence of SEQ ID NO:1 or a sequence homologous 
thereto, and on the finding that the genomic DNA efficiently 
expresses the polypeptide with high biological activities when 
introduced into mammalian host cells. The genomic DNA of the 
present invention usually contains two or more exons, at least 
one of which possesses a part of or the whole of the nucleotide 
sequence of SEQ ID NO:2. The wording "a part" includes a single 
nucleotide and a run of two or more consecutive nucleotides in 
SEQ ID NO:2. Examples of the exons are SEQ ID NOs:3 and 4. 
Human genomic DNA may contain additional exons with SEQ ID NOs:5 
to 7. Since the present genomic DNA is derived from a mammalian 
genomic DNA, it contains introns, a distinctive feature of 
mammalian genomic DNAs. The present genomic DNA usually has two 
or more introns such as SEQ ID NOs:8 to 12. 

More particular examples of the present genomic DNA 
include DNAs with SEQ ID NOs:13 and 14 or sequences 
complementary thereto. The DNAs with SEQ ID NOs:13 and 14 are 
substantially the same. The DNA with SEQ ID NO:14 contains 
coding regions for a leader peptide, consisting of the 
nucleotides 15,607th-15,685th, 17,057th-17,068th and 
20,452nd-20,468th, coding regions for the present polypeptide, 
consisting of the nucleotides 20,469th-20,586th, 
21,921st-22,054th and 26,828th-27,046th, and regions as introns, 
consisting of the nucleotides 15,686th-17,056th, 
17,069th-20,451st, 20,587th-21,920th and 22,055th-26,827th. The 
genomic DNA with SEQ ID NO:13 is suitable for expressing the 
polypeptide in mammalian host cells. 
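
As a consistency check on the coordinates given for SEQ ID NO:14, the following minimal sketch totals the coding regions and confirms that the stated introns fill the gaps between them exactly. The variable and function names are illustrative assumptions; only the coordinates come from the text above.

```python
# Coordinates are 1-based and inclusive, as stated for SEQ ID NO:14.
LEADER = [(15607, 15685), (17057, 17068), (20452, 20468)]
MATURE = [(20469, 20586), (21921, 22054), (26828, 27046)]
INTRONS = [(15686, 17056), (17069, 20451), (20587, 21920), (22055, 26827)]


def total_length(regions):
    """Sum the lengths of inclusive (start, end) regions."""
    return sum(end - start + 1 for start, end in regions)


def tiles_without_gaps(regions):
    """True if the sorted regions abut one another with no gap or overlap."""
    ordered = sorted(regions)
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


leader_nt = total_length(LEADER)   # 79 + 12 + 17 nucleotides
mature_nt = total_length(MATURE)   # 118 + 134 + 219 nucleotides
contiguous = tiles_without_gaps(LEADER + MATURE + INTRONS)
```

Under these coordinates the leader coding regions total 108 nucleotides and the coding regions for the polypeptide total 471 nucleotides, both multiples of three, and the exons and introns tile the span from nucleotide 15,607 to 27,046 without gaps.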

Generally in this field, when artificially expressing 
a DNA encoding a polypeptide in a host, one or more nucleotides 
in the DNA may be replaced by different ones, and appropriate 
promoter(s) and/or enhancer(s) may be linked to the DNA to 
improve the expression efficiency or the properties of the 
expressed polypeptide. The present genomic DNA can be altered 
in the same way. Therefore, as long as the biological 
activities of the expressed polypeptides are not substantially 
changed, the present genomic DNA should include DNAs 
encoding functional equivalents of the polypeptide, formed as 
follows: One or more nucleotides in SEQ ID NOs:3 to 14 are 
replaced by different ones, the untranslated regions and/or the 
coding region for a leader peptide in the 5'- and/or 3'-termini 
of SEQ ID NOs:3, 4, 5, 6, 7, 13 and 14 are deleted, and 
appropriate oligonucleotides are linked to either or both ends 
of SEQ ID NO:13. 

The present genomic DNA includes general DNAs which 
are derived from a genome containing the nucleotide sequences 
as above, and it is not restricted as to source or origin as 
long as it has been isolated from its original organism. For 
example, the present genomic DNA can be obtained by chemical 
synthesis based on SEQ ID NOs:2 to 14, or by isolation from 
a human genomic DNA. The isolation of the present genomic DNA 
from such a human genomic DNA comprises (a) isolating a genomic 
DNA from human cells by conventional methods, (b) screening the 
genomic DNA with probes or primers, which are chemically 
synthesized oligonucleotides with a part of or the whole of the 
nucleotide sequence of SEQ ID NO:2, and (c) collecting a DNA to 
which the probes or primers specifically hybridize. Once the 
present genomic DNA is obtained, it can be replicated without 
limit, either by constructing a recombinant DNA with an 
autonomously replicable vector by a conventional method, 
introducing the recombinant DNA into an appropriate host such 
as a microorganism or an animal cell, and culturing the 
transformant, or by applying a PCR method. 

The present genomic DNA is very useful in producing 
the polypeptide by recombinant DNA techniques since it 
efficiently expresses the polypeptide with high biological 
activities when introduced into mammalian host cells. The 
present invention further provides a process for preparing a 
polypeptide using a specific genomic DNA, comprising the steps 
of (a) culturing a transformant formed by introducing the 
present genomic DNA into mammalian host cells, and (b) 
collecting the polypeptide which induces IFN-γ production by 
immunocompetent cells from the resultant culture. 

The following explains the process for preparing the 
polypeptide according to the present invention. The present 
genomic DNA is usually introduced into host cells in the form 
of a recombinant DNA. The recombinant DNA, comprising the 
present genomic DNA and an autonomously replicable vector, can 
be prepared relatively easily by conventional recombinant DNA 
techniques when the genomic DNA is available. The vectors into 
which the present genomic DNA can be inserted include plasmid 
vectors such as pcD, pcDL-SRα, pKY4, pCDM8, pCEV4 and pME18S. 
The autonomously replicable vectors usually further contain 
appropriate nucleotide sequences for the expression of the 
present recombinant DNA in each host cell, which include 
sequences for promoters, enhancers, replication origins, 
transcription termination sites, splicing sequences and/or 
selective markers. Heat shock protein promoters or IFN-α 
promoters, as disclosed in Japanese Patent Kokai No. 163,368/95 
by the same applicant as this invention, make it possible to 
regulate the expression of the present genomic DNA artificially 
by external stimuli. 

To insert the present genomic DNA into vectors, 
conventional methods used in this field can be arbitrarily used: 
Genes containing the present genomic DNA and autonomously 
replicable vectors are cleaved with restriction enzymes and/or 
ultrasonication, and the resultant DNA fragments and the 
resultant vector fragments are ligated. Cleaving the genes and 
vectors with restriction enzymes which act specifically on 
nucleotides, more particularly, AccI, BamHI, BglII, BstXI, 
EcoRI, HindIII, NotI, PstI, SacI, SalI, SmaI, SpeI, XbaI, XhoI, 
etc., facilitates the ligation of the DNA fragments and the 
vector fragments. To ligate the DNA fragments and the vector 
fragments, they are, if necessary, first annealed, then treated 
with a DNA ligase in vivo or in vitro. The recombinant DNAs 
thus obtained can be replicated without limit in hosts derived 
from microorganisms or animals. 

Any cells conventionally used as hosts in this field 
can be used as the host cells. Examples are epithelial, 
interstitial and hemopoietic cells derived from human, monkey, 
mouse and hamster, more particularly, 3T3 cells, C127 cells, CHO 
cells, CV-1 cells, COS cells, HeLa cells, MOP cells and their 
mutants. Cells which inherently produce the present polypeptide 
can also be used as the host cells. Examples are human 
hemopoietic cells such as lymphoblasts, lymphocytes, monoblasts, 
monocytes, myeloblasts, myelocytes, granulocytes and 
macrophages, and human epithelial and interstitial cells derived 
from solid tumors such as pulmonary carcinoma, large bowel 
cancer and colon cancer. More particular examples of such 
hemopoietic cells are leukemia cell lines such as HBL-38 cells, 
HL-60 cells ATCC CCL240, K-562 cells ATCC CCL243, KG-1 cells 
ATCC CCL246, Mo cells ATCC CRL8066, THP-1 cells ATCC TIB202 and 
U-937 cells ATCC CRL1593-2, described by J. Minowada et al. in 
"Cancer Research", Vol.10, pp. 1-18 (1988), derived from 
leukemias or lymphomas including myelogenous leukemias, 
promyelocytic leukemias, monocytic leukemias, adult T-cell 
leukemias and hairy cell leukemias, and their mutants. The 
ability of these leukemia cell lines and their mutants to 
process the present polypeptide is so distinguished that they 
can easily yield the polypeptide with higher biological 
activities when used as hosts. 

To introduce the present DNA into the hosts, 
conventional methods such as the DEAE-dextran method, calcium 
phosphate transfection method, electroporation method, 
lipofection method, microinjection method, and viral infection 
methods using retrovirus, adenovirus, herpesvirus and vaccinia 
virus can be used. The polypeptide-producing clones among the 
transformants can be selected by applying the colony 
hybridization method or by observing the polypeptide production 
after culturing the transformants in culture media. For 
example, the recombinant DNA techniques using mammalian cells 
as hosts are detailed in "Jikken-Igaku-Bessatsu Saibo-Kogaku 
Handbook (The handbook for the cell engineering)" (1992), edited 
by Toshio KUROKI, Masaru TANIGUCHI and Mitsuo OSHIMURA, 
published by YODOSHA CO., LTD., Tokyo, Japan, and 
"Jikken-Igaku-Bessatsu Biomanual Series 3 Idenshi Cloning 
Jikken-Ho (The experimental methods for the gene cloning)" 
(1993), edited by Takahi YOKOTA and Ken-ichi ARAI, published by 
YODOSHA CO., LTD., Tokyo, Japan. 

The transformants thus obtained produce the present 
polypeptide intracellularly and/or extracellularly when cultured 
in culture media. As the culture media, conventional ones used 
for mammalian cells can be used. The culture media generally 
comprise (a) buffers as a base, (b) inorganic ions such as 
sodium ion, potassium ion, calcium ion, phosphate ion and 
chloride ion, (c) micronutrients, carbon sources, nitrogen 
sources, amino acids and vitamins, which are added depending on 
the metabolic ability of the cells, and (d) sera, hormones, cell 
growth factors and cell adhesion factors, which are added if 
necessary. Examples of individual media include 199 medium, 
DMEM medium, Ham's F12 medium, IMDM medium, MCDB 104 medium, 
MCDB 153 medium, MEM medium, RD medium, RITC 80-7 medium, 
RPMI-1630 medium, RPMI-1640 medium and WAJC 404 medium. The 
cultures containing the present polypeptide are obtainable by 
inoculating the transformants into the culture media to give a 
cell density of 1 x 10^4 to 1 x 10^7 cells/ml, more preferably, 
1 x 10^5 to 1 x 10^6 cells/ml, and then subjecting them to 
suspension or monolayer culture at about 37°C for 1-7 days, more 
preferably, 2-4 days, while replacing the culture media with a 
fresh preparation as appropriate. The cultures thus obtained 
usually contain the present polypeptide in a concentration of 
about 1-100 µg/ml, which may vary depending on the types of the 
transformants or the culture conditions used. 

While the cultures thus obtained can be used intact 
as an IFN-γ inducer, they are usually subjected to a step for 
separating the present polypeptide from the cells or the cell 
debris using filtration, centrifugation, etc. before use, which 
may follow a step for disrupting the cells with ultrasonication, 
cell-lytic enzymes and/or detergents if desired, and to a step 
for purifying the polypeptide. The cultures from which the 
cells or cell debris are removed are usually subjected to 
conventional methods used in this field for purifying 
biologically active polypeptides, such as salting-out, dialysis, 
filtration, concentration, separatory sedimentation, 
ion-exchange chromatography, gel filtration chromatography, 
adsorption chromatography, chromatofocusing, hydrophobic 
chromatography, reversed phase chromatography, affinity 
chromatography, gel electrophoresis and/or isoelectric focusing. 
The resultant purified polypeptide can be concentrated and/or 
lyophilized into liquids or solids depending on final uses. The 
monoclonal antibodies disclosed in Japanese Patent Kokai 
No. 231,598/96 by the same applicant as this invention are 
extremely useful for purifying the present polypeptide. 
Immunoaffinity chromatography using monoclonal antibodies yields 
the present polypeptide in a relatively high purity at the 
lowest cost and labor. 

The polypeptide obtainable by the process according 
to the present invention exerts strong effects in the treatment 
and/or the prevention of IFN-γ- and/or killer cell-susceptive 
diseases since it possesses the properties of enhancing killer 
cells' cytotoxicity and inducing killer cells' formation as well 
as inducing IFN-γ, a useful biologically active protein, as 
described above. The polypeptide according to the present 
invention has a high activity of inducing IFN-γ, and this 
enables a desired amount of IFN-γ production with only a small 
amount. The polypeptide has such low toxicity that it scarcely 
causes serious side effects even when administered in a 
relatively high dose. Therefore, the polypeptide has the 
advantage that it can readily induce IFN-γ in a desired amount 
without strictly controlling the dosage. The uses as agents for 
susceptive diseases are detailed in Japanese Patent Application 
No. 28,722/96 by the same applicant as this invention. 

The present genomic DNA is also useful for so-called 
"gene therapy". According to conventional gene therapy, the 
present DNA can be introduced into patients with IFN-γ- and/or 
killer cell-susceptive diseases by direct injection after the 
DNA is inserted into vectors derived from viruses such as 
retrovirus, adenovirus and adeno-associated virus or is 
incorporated into cationic or membrane-fusible liposomes, or 
by self-transplanting lymphocytes which are collected from 
patients before the DNA is introduced. In adoptive 
immunotherapy with gene therapy, the present DNA is introduced 
into effector cells similarly as in conventional gene therapy. 
This can enhance the cytotoxicity of the effector cells to tumor 
cells, resulting in improvement of the adoptive immunotherapy. 
In tumor vaccine therapy with gene therapy, tumor cells from 
patients, into which the present genomic DNA is introduced 
similarly as in conventional gene therapy, are self-transplanted 
after being proliferated ex vivo to a desired cell number. 
The transplanted tumor cells act as vaccines in the patients to 
exert a strong antitumor immunity specifically to antigens. 
Thus, the present genomic DNA exhibits considerable effects in 
gene therapy for diseases including viral diseases, microbial 
diseases, malignant tumors and immunopathies. The general 
procedures for gene therapy are detailed in 
"Jikken-Igaku-Bessatsu Biomanual UP Series 
Idenshichiryo-no-Kisogijutsu (Basic techniques for the gene 
therapy)" (1996), edited by Takashi ODAJIMA, Izumi SAITO and 
Keiya OZAWA, published by YODOSHA CO., LTD., Tokyo, Japan. 

The following examples explain the present invention, 
and the techniques used therein are conventional ones used in 
this field: For example, the techniques are described in 
"Jikken-Igaku-Bessatsu Saibo-Kogaku Handbook (The handbook for 
the cell engineering)" (1992), edited by Toshio KUROKI, Masaru 
TANIGUCHI and Mitsuo OSHIMURA, published by YODOSHA CO., LTD., 
Tokyo, Japan, and "Jikken-Igaku-Bessatsu Biomanual Series 3 
Idenshi Cloning Jikken-Ho (The experimental methods for the gene 
cloning)" (1993), edited by Takahi YOKOTA and Ken-ichi ARAI, 
published by YODOSHA CO., LTD., Tokyo, Japan. 
Example 1 

Cloning genomic DNA and determination of nucleotide sequence 
Example 1-1 

Determination of partial nucleotide sequence 

Five ng of "PromoterFinder™ DNA PvuII LIBRARY", a 
human placental genomic DNA library commercialized by CLONTECH 
Laboratories, Inc., California, USA, 5 µl of 10 x Tth PCR 
reaction solution, 2.2 µl of 25 mM magnesium acetate, 4 µl of 
2.5 mM dNTP-mixed solution, one µl of the mixed solution of 2 
unit/µl rTth DNA polymerase XL and 2.2 µg/µl Tth Start Antibody 
in a ratio of 4:1 by volume, 10 pmol of an oligonucleotide with 
the nucleotide sequence of 5'-CCATCCTAATACGACTCACTATAGGGC-3' as 
an adaptor primer, and 10 pmol of an oligonucleotide with the 
nucleotide sequence of 5'-TTCCTCTTCCCGAAGCTGTGTAGACTGC-3' as an 
anti-sense primer, which was chemically synthesized based on the 
sequence of the nucleotides 88th-115th in SEQ ID NO:2, were 
mixed and brought up to 50 µl with sterilized distilled water. 
After incubating at 94°C for one min, the mixture was subjected 
to 7 cycles of incubations at 94°C for 25 sec and at 72°C for 
4 min, followed by 32 cycles of incubations at 94°C for 25 sec 
and at 67°C for 4 min to perform PCR. 
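
The final concentrations in the 50-volume-unit reaction follow from simple dilution (stock concentration times stock volume divided by total volume). The sketch below works them out, taking the volumes as microliters; the function and variable names are illustrative assumptions, and the buffer is expressed as a multiple of its working strength.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul=50.0):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Stock concentrations and volumes as stated for the first PCR mixture.
buffer_x = final_concentration(10.0, 5.0)    # 10 x reaction solution
mg_acetate_mM = final_concentration(25.0, 2.2)  # 25 mM magnesium acetate
dntp_mM = final_concentration(2.5, 4.0)      # 2.5 mM dNTP-mixed solution

# 10 pmol of primer in 50 microliters; pmol per microliter equals micromolar.
primer_uM = 10.0 / 50.0
```

On these figures the reaction contains the buffer at 1 x, about 1.1 mM magnesium acetate, 0.2 mM dNTP mixture and each primer at 0.2 µM.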

The reaction mixture was diluted 100-fold with 
sterilized distilled water. One µl of the dilution, 5 µl of 10 
x Tth PCR reaction solution, 2.2 µl of 25 mM magnesium acetate, 
4 µl of 2.5 mM dNTP-mixed solution, one µl of the mixed solution 
of 2 unit/µl rTth DNA polymerase XL and 2.2 µg/µl Tth Start 
Antibody in a ratio of 4:1 by volume, 10 pmol of an 
oligonucleotide with the nucleotide sequence of 
5'-CTATAGGGCACGCGTGGT-3' as a nested primer, and 10 pmol of an 
oligonucleotide with the nucleotide sequence of 
5'-TTCCTCTTCCCGAAGCTGTGTAGACTGC-3' as an anti-sense primer, which 
was chemically synthesized similarly as above, were mixed and 
brought up to 50 µl with sterilized distilled water. After 
incubating at 94°C for one min, the mixture was subjected to 5 
cycles of incubations at 94°C for 25 sec and at 72°C for 4 min, 
followed by 22 cycles of incubations at 94°C for 25 sec and at 
67°C for 4 min to perform PCR for amplifying a DNA fragment of 
the present genomic DNA. The genomic DNA library and reagents 
for PCR used above were mainly from "PromoterFinder™ DNA WALKING 
KITS", commercialized by CLONTECH Laboratories, Inc., 
California, USA. 
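
The two cycling programs described in this example can be written down as data and their total hold times compared. This is only a sketch: the names are illustrative assumptions, and the totals count the stated incubation times alone, ignoring temperature ramping between steps.

```python
def program_seconds(initial_hold_s, stages):
    """Total hold time for an initial incubation plus cycled stages.

    Each stage is (number_of_cycles, (step_seconds, ...)).
    """
    return initial_hold_s + sum(cycles * sum(steps) for cycles, steps in stages)


# 25 sec at 94 C and 4 min (240 sec) at 72 C or 67 C per cycle.
FIRST_PCR = [(7, (25, 240)), (32, (25, 240))]
SECOND_PCR = [(5, (25, 240)), (22, (25, 240))]

first_total_s = program_seconds(60, FIRST_PCR)
second_total_s = program_seconds(60, SECOND_PCR)
```

The first PCR comes to 10,395 seconds of incubation (about 173 minutes) and the second, nested PCR to 7,215 seconds (about 120 minutes).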

An adequate amount of the PCR product thus obtained 
was mixed with 50 ng of "pT7 Blue(R)", a plasmid vector 
commercialized by Novagen, Inc., WI, USA, and an adequate amount 
of T4 DNA ligase, and 100 mM ATP was added to give a final 
concentration of one mM, followed by incubating at 16°C for 18 
hr to insert the DNA fragment into the plasmid vector. The 
obtained recombinant DNA was introduced into an Escherichia coli 
JM109 strain by the competent cell method to form a 
transformant, which was then inoculated into L-broth medium (pH 
7.2) containing 50 µg/ml ampicillin and cultured at 37°C for 18 
hr. The cells were isolated from the resulting culture, and 
then subjected to the conventional alkali-SDS method to collect 
a recombinant DNA. The dideoxy method analysis confirmed that 
the recombinant DNA contained the DNA fragment with a sequence 
of the nucleotides 5,150th-6,709th in SEQ ID NO:14. 
Example 1-2 

Determination of partial nucleotide sequence 

PCR was performed under the same conditions as the first
PCR in Example 1-1, but an oligonucleotide with the nucleotide
sequence of 5'-GTAAGTTTTCACCTTCCAACTGTAGAGTCC-3', which was
chemically synthesized based on the nucleotide sequence of the
DNA fragment in Example 1-1, was used as an anti-sense primer.

The reaction mixture was diluted 100-fold with
sterilized distilled water. One µl of the dilution was placed
into a reaction tube, and PCR was performed under the same
conditions as used in the second PCR in Example 1-1 to amplify
another DNA fragment of the present genomic DNA, but an
oligonucleotide with the nucleotide sequence of
5'-GGGATCAAGTAGTGATCAGAAGCAGCACAC-3', which was chemically
synthesized based on the nucleotide sequence of the DNA fragment
in Example 1-1, was used as an anti-sense primer.

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 1st-5,228th in SEQ ID NO: 14.
Example 1-3 

Determination of partial nucleotide sequence 

0.5 µg of a human placental genomic DNA,
commercialized by CLONTECH Laboratories, Inc., California, USA,
5 µl of 10 x PCR reaction solution, 8 µl of 2.5 mM dNTP-mixed
solution, one µl of the mixed solution of 5 units/µl "TAKARA LA
Taq POLYMERASE" and 1.1 µg/µl "TaqStart ANTIBODY" in a ratio of
1:1 by volume, both commercialized by Takara Syuzo
Co., Tokyo, Japan, 10 pmol of an oligonucleotide with the
nucleotide sequence of 5'-CCTGGCTGCCAACTCTGGCTGCTAAAGCGG-3' as
a sense primer, chemically synthesized based on a sequence of
the nucleotides 46th-75th in SEQ ID NO: 2, and 10 pmol of an
oligonucleotide with the nucleotide sequence of
5'-GTATTGTCAATAAATTTCATTGCCACAAAGTTG-3' as an anti-sense primer,
chemically synthesized based on a sequence of the nucleotides
210th-242nd in SEQ ID NO: 2, were mixed and made up to 50 µl
with sterilized distilled water. After incubating at 94 °C for
one min, the mixture was subjected to 5 cycles of incubations
at 98 °C for 20 sec and at 68 °C for 10 min, followed by 25 cycles
of incubations at 98 °C for 20 sec and at 68 °C for 10 min, with
the time extended by 5 sec in every cycle, and finally incubated at
72 °C for 10 min to amplify a further DNA fragment of the present
genomic DNA. The reagents for PCR used above were mainly from
"TAKARA LA PCR KIT VERSION 2", commercialized by Takara Syuzo
Co., Tokyo, Japan.
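
As a consistency check on the primer design described above, the length of each primer can be compared with the coordinate range it was designed on, and the anti-sense primer can be converted back to the coding-strand orientation. This is a minimal Python sketch; the helper names `span_length` and `reverse_complement` are illustrative assumptions, and the sketch only manipulates the primer strings quoted in the text, not SEQ ID NO: 2 itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    return last - first + 1


# Primers of Example 1-3 as quoted in the text.
SENSE = "CCTGGCTGCCAACTCTGGCTGCTAAAGCGG"        # nucleotides 46th-75th
ANTISENSE = "GTATTGTCAATAAATTTCATTGCCACAAAGTTG"  # nucleotides 210th-242nd

if __name__ == "__main__":
    print(len(SENSE), span_length(46, 75))
    print(len(ANTISENSE), span_length(210, 242))
    # Coding-strand text the anti-sense primer would anneal against,
    # assuming a perfect match to the template.
    print(reverse_complement(ANTISENSE))
```

Both primers have exactly the length implied by their stated coordinates (30 and 33 nucleotides).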

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 6,640th-15,671st in SEQ ID NO: 14.
Example 1-4

Determination of partial nucleotide sequence

PCR was performed under the same conditions as the PCR
in Example 1-3 to amplify yet another DNA fragment of the
present genomic DNA, but an oligonucleotide with the nucleotide
sequence of 5'-AAGATGGCTGCTGAACCAGTAGAAGACAATTGC-3', chemically
synthesized based on a sequence of the nucleotides 175th-207th
in SEQ ID NO: 2, was used as a sense primer, an oligonucleotide
with the nucleotide sequence of
5'-TCCTTGGTCAATGAAGAGAACTTGGTC-3', chemically synthesized based
on a sequence of the nucleotides 334th-360th in SEQ ID NO: 2,
was used as an anti-sense primer, and after incubating at 98 °C
for 20 sec, the reaction mixture was subjected to 30 cycles of
incubations at 98 °C for 20 sec and at 68 °C for 3 min, followed
by incubating at 72 °C for 10 min.

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 15,604th-20,543rd in SEQ ID NO: 14.
Example 1-5 

Determination of partial nucleotide sequence 

PCR was performed under the same conditions as the PCR
in Example 1-4 to amplify yet another DNA fragment of the
present genomic DNA, but an oligonucleotide with the nucleotide
sequence of 5'-CCTGGAATCAGATTACTTTGGCAAGCTTGAATC-3', chemically
synthesized based on the sequence of the nucleotides 273rd-305th
in SEQ ID NO: 2, was used as a sense primer, and an
oligonucleotide with the nucleotide sequence of
5'-GGAAATAATTTTGTTCTCACAGGAGAGAGTTG-3', chemically synthesized
based on the sequence of the nucleotides 500th-531st in SEQ ID
NO: 2, was used as an anti-sense primer.

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 20,456th-22,048th in SEQ ID NO: 14.
Example 1-6 

Determination of partial nucleotide sequence 

PCR was performed under the same conditions as the PCR
in Example 1-4 to amplify yet another DNA fragment of the
present genomic DNA, but an oligonucleotide with the nucleotide
sequence of 5'-GCCAGCCTAGAGGTATGGCTGTAACTATCTC-3', chemically
synthesized based on the sequence of the nucleotides 449th-479th
in SEQ ID NO: 2, was used as a sense primer, and an
oligonucleotide with the nucleotide sequence of
5'-GGCATGAAATTTTAATAGCTAGTCTTCGTTTTG-3', chemically synthesized
based on the sequence of the nucleotides 745th-777th in SEQ ID
NO: 2, was used as an anti-sense primer.

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 21,996th-27,067th in SEQ ID NO: 14.
Example 1-7 

Determination of partial nucleotide sequence 

PCR was performed under the same conditions as the first
PCR in Example 1-2 to amplify yet another DNA fragment of
the present genomic DNA, but an oligonucleotide with the
nucleotide sequence of 5'-GTGACATCATATTCTTTCAGAGAAGTGTCC-3',
chemically synthesized based on the sequence of the nucleotides
575th-604th in SEQ ID NO: 2, was used as a sense primer.

The reaction mixture was diluted 100-fold with
sterilized distilled water. One µl of the dilution was placed
into a reaction tube, and PCR was performed under the same
conditions as the second PCR in Example 1-2 to amplify yet
another DNA fragment of the present genomic DNA, but an
oligonucleotide with the sequence of
5'-GCAATTTGAATCTTCATCATACGAAGGATAC-3', chemically synthesized based
on a sequence of the nucleotides 624th-654th in SEQ ID NO: 2, was
used as a sense primer.

The DNA fragment was inserted into the plasmid vector
similarly as in Example 1-1 to obtain a recombinant DNA. The
recombinant DNA was replicated in Escherichia coli before being
collected. The analysis of the collected recombinant DNA
confirmed that it contained the DNA fragment with a sequence of
the nucleotides 26,914th-28,994th in SEQ ID NO: 14.
Example 1-8 

Determination of complete nucleotide sequence

Comparing the nucleotide sequence of SEQ ID NO: 2,
which was proved to encode the present polypeptide, as disclosed
in Japanese Patent Kokai No. 193,098/96 by the same applicant as
this invention, with the partial nucleotide sequences identified
in Examples 1-1 to 1-7, it was proved that the present genomic
DNA contained the nucleotide sequence of SEQ ID NO: 14. SEQ ID
NO: 14, consisting of 28,994 base pairs (bp), was far
longer than SEQ ID NO: 2, consisting of only 471 bp. This
suggested that SEQ ID NO: 14 contained introns, a characteristic
of eukaryotic cells.
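
The coordinates reported in Examples 1-1 to 1-7 can be checked to confirm that the seven cloned fragments tile the 28,994 bp of SEQ ID NO: 14 without gaps. The following Python sketch uses only the coordinates stated in those Examples; the function name `check_tiling` is an illustrative assumption.

```python
# 1-based inclusive coordinates in SEQ ID NO: 14, as reported in the Examples.
FRAGMENTS = {
    "Example 1-2": (1, 5228),
    "Example 1-1": (5150, 6709),
    "Example 1-3": (6640, 15671),
    "Example 1-4": (15604, 20543),
    "Example 1-5": (20456, 22048),
    "Example 1-6": (21996, 27067),
    "Example 1-7": (26914, 28994),
}


def check_tiling(fragments, total_length):
    """Return the overlap in bp between each pair of neighboring fragments.

    Raises ValueError if the fragments leave a gap or do not span
    positions 1..total_length.
    """
    ordered = sorted(fragments.items(), key=lambda item: item[1][0])
    first_start = ordered[0][1][0]
    last_end = max(end for _, (_, end) in ordered)
    if first_start != 1 or last_end != total_length:
        raise ValueError("fragments do not span the full sequence")
    overlaps = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        overlap = end_a - start_b + 1
        if overlap <= 0:
            raise ValueError(f"gap between {name_a} and {name_b}")
        overlaps.append((name_a, name_b, overlap))
    return overlaps


if __name__ == "__main__":
    for left, right, bp in check_tiling(FRAGMENTS, 28994):
        print(f"{left} / {right}: {bp} bp overlap")
```

Each neighboring pair shares between 53 and 154 bp, which is what allows the partial sequences to be joined into one contiguous sequence.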

It was examined where the partial nucleotide sequences of
SEQ ID NO: 2, i.e., exons, and the donor and acceptor sites in
introns, consisting of the nucleotides GT and AG respectively,
were located in SEQ ID NO: 14. Consequently, it was proved that
SEQ ID NO: 14 contained at least 5 introns, which were located in
the order of SEQ ID NOs: 10, 11, 12, 8 and 9 in the direction from
the 5'- to the 3'-terminus. Therefore, the sequences between the
neighboring introns must be exons, which were thought to be
located in the order of SEQ ID NOs: 5, 6, 3, 4 and 7 in the
direction from the 5'- to the 3'-terminus. It was also proved
that SEQ ID NO: 7 contained the 3'-untranslated region in addition
to the exons. The features of the sequence thus elucidated are
annotated in SEQ ID NO: 14.
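
The GT/AG rule used above to locate intron boundaries can be illustrated with a short Python sketch. The toy sequence and its intron coordinates are hypothetical and are not taken from SEQ ID NO: 14; the function names are likewise assumptions made for illustration.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron starts with the donor GT and ends with the acceptor AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def splice_out(genomic: str, introns):
    """Join the exons of `genomic` by removing the given introns.

    `introns` holds 1-based inclusive (start, end) pairs; each intron must
    obey the GT...AG rule, otherwise ValueError is raised.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if not has_canonical_splice_sites(genomic[start - 1:end]):
            raise ValueError(f"intron {start}-{end} lacks GT...AG boundaries")
        exons.append(genomic[cursor:start - 1])
        cursor = end
    exons.append(genomic[cursor:])
    return "".join(exons)


# Hypothetical toy gene: exon, intron, exon, intron, exon.
TOY_GENE = "ATGGCT" + "GTAAGTCCCTTTCAG" + "GCTGAA" + "GTGAGTAAACAG" + "CCATAA"
TOY_INTRONS = [(7, 21), (28, 39)]

if __name__ == "__main__":
    print(splice_out(TOY_GENE, TOY_INTRONS))
```

Removing the two toy introns leaves the 18-nucleotide string formed by the three exons, in the same way that the exons of SEQ ID NO: 14 correspond to the cDNA sequence once the introns are set aside.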

As disclosed in Japanese Patent Kokai No. 193,098/96
by the same applicant as this invention, the present polypeptide
is produced in human cells as a polypeptide whose N-terminal
amino acid is tyrosine rather than methionine, as observed
in SEQ ID NO: 1. This suggests that the present genomic DNA
contains a leader peptide region upstream of the 5'-terminus
of the present polypeptide-encoding region. A sequence
consisting of 36 amino acids, encoded by the region upstream of
the nucleotides 20,469th-20,471st, which are the nucleotides TAC,
is described as a leader peptide in SEQ ID NO: 14.
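
The sizes quoted in this Example can be related by simple codon arithmetic: three nucleotides encode one amino acid. The Python sketch below is illustrative only; `coding_length_nt` is an assumed helper name, and the residue counts are those stated in the text (157 amino acids in SEQ ID NO: 1, 36 amino acids in the leader peptide).

```python
CODON_LENGTH = 3  # nucleotides per amino acid


def coding_length_nt(amino_acids: int, include_stop: bool = False) -> int:
    """Nucleotides needed to encode `amino_acids` residues, optionally plus a stop codon."""
    if amino_acids < 0:
        raise ValueError("amino_acids must be non-negative")
    return (amino_acids + (1 if include_stop else 0)) * CODON_LENGTH


MATURE_RESIDUES = 157  # length of SEQ ID NO: 1
LEADER_RESIDUES = 36   # leader peptide described in SEQ ID NO: 14

if __name__ == "__main__":
    print("mature polypeptide:", coding_length_nt(MATURE_RESIDUES), "nt")
    print("leader peptide:", coding_length_nt(LEADER_RESIDUES), "nt")
    print("leader + mature:", coding_length_nt(MATURE_RESIDUES + LEADER_RESIDUES), "nt")
```

The 157 residues of the mature polypeptide correspond to 471 nucleotides, which agrees with the 471 bp stated for SEQ ID NO: 2, while the 28,994 bp of SEQ ID NO: 14 are mostly intronic.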
Example 2 

Preparation of recombinant DNA pBGHuGF for expression 

0.06 ng of the DNA fragment in Example 1-4 in a
concentration of 3 ng/50 µl, 0.02 ng of the DNA fragment
obtained by the methods in Example 1-5, 5 µl of 10 x LA PCR
reaction solution, 8 µl of 2.5 mM dNTP-mixed solution, one µl
of the mixed solution of 5 units/µl TAKARA LA Taq polymerase and
1.1 µg/µl TaqStart Antibody in a ratio of 1:1 by volume, 10 pmol
of an oligonucleotide with the sequence of
5'-TCCGAAGCTTAAGATGGCTGCTGAACCAGTA-3' as a sense primer, chemically
synthesized based on the nucleotide sequence of the DNA fragment
in Example 1-4, and 10 pmol of an oligonucleotide with the
nucleotide sequence of 5'-GGAAATAATTTTGTTCTCACAGGAGAGAGTTG-3'
as an anti-sense primer, chemically synthesized based on the
nucleotide sequence of the DNA fragment in Example 1-5, were
mixed and made up to 50 µl with sterilized distilled water.
After incubating at 94 °C for one min, the mixture was subjected
to 5 cycles of incubations at 98 °C for 20 sec and at 72 °C for
7 min, followed by 25 cycles of incubations at 98 °C for 20 sec
and at 68 °C for 7 min to perform PCR. The reaction mixture was
cleaved by restriction enzymes HindIII and SphI to obtain a DNA
fragment of about 5,900 bp, with cleavage sites by HindIII and
SphI at its two termini.

PCR was performed under the same conditions as above, but
0.02 ng of the DNA fragment in Example 1-5, 0.06 ng of the DNA
fragment obtained in Example 1-6, an oligonucleotide with the
nucleotide sequence of
5'-ATGTAGCGGCCGCGGCATGAAATTTTAATAGCTAGTC-3' as an anti-sense
primer, chemically synthesized based on the
nucleotide sequence of the DNA fragment in Example 1-6, and an
oligonucleotide with the sequence of
5'-CCTGGAATCAGATTACTTTGGCAAGCTTGAATC-3' as a sense primer,
chemically synthesized based on the DNA fragment in Example 1-6,
were used. The reaction mixture was cleaved by restriction
enzymes NotI and SphI to obtain a DNA fragment of about 5,600
bp, with cleavage sites by NotI and SphI at its two termini.

A plasmid vector "pRc/CMV", containing a
cytomegalovirus promoter, commercialized by Invitrogen
Corporation, San Diego, USA, was cleaved by restriction enzymes
HindIII and NotI to obtain a vector fragment of about 5,500 bp.
The vector fragment was mixed with the above two DNA fragments
of about 5,900 bp and 5,600 bp, and reacted with T4 DNA ligase
to insert the two DNA fragments into the plasmid vector. An
Escherichia coli JM109 strain was transformed with the obtained
recombinant DNA, and the transformant with the plasmid vector
was selected by the colony hybridization method. The selected
recombinant DNA was named "pBGHuGF". As shown in FIG. 1, the
present genomic DNA, with the nucleotide sequence of SEQ ID
NO: 13, was ligated downstream of the cleavage site by the
restriction enzyme HindIII in the recombinant DNA.
Example 3 

Preparation of transformant using CHO cell as host 

CHO-K1 cells ATCC CCL61 were inoculated into Ham's F12
medium (pH 7.2) containing 10 v/v % bovine fetal serum and
proliferated in a conventional manner. The proliferated cells
were collected and washed with phosphate-buffered saline
(hereinafter abbreviated as "PBS"), followed by suspending in PBS
to give a cell density of 1 x 10^7 cells/ml.

10 µg of the recombinant DNA pBGHuGF in Example 2 and
0.8 ml of the above cell suspension were placed in a cuvette and
ice-chilled for 10 min. The cuvette was installed in "GENE
PULSER", an electroporation device commercialized by Bio-Rad
Laboratories Inc., Brussels, Belgium, and then pulsed once with
an electric discharge. After pulsing, the cuvette was
immediately taken out and ice-chilled for 10 min. The cell
suspension from the cuvette was inoculated into Ham's F12 medium
(pH 7.2) containing 10 v/v % bovine fetal serum and cultured
under an ambient condition of 5 v/v % CO2 at 37 °C for 3 days.
To the culture medium was added G-418 to give a final
concentration of 400 µg/ml, and the culturing was continued
for a further 3 weeks under the same conditions. From about 100
colonies formed, 48 colonies were selected, and a portion of
each was inoculated into a well of culturing plates with Ham's
F12 medium (pH 7.2) containing 400 µg/ml G-418 and 10 v/v %
bovine fetal serum and cultured similarly as above. Thereafter,
to each well of the culturing plates was added 10 mM Tris-HCl
buffer (pH 8.5) containing 5.1 mM magnesium chloride, 0.5 w/v
% sodium deoxycholate, 1 w/v % NONIDET P-40, 10 µg/ml aprotinin
and 0.1 w/v % SDS to lyse the cells.

A 50 µl aliquot of the cell lysates was mixed with one
ml of glycerol and incubated at 37 °C for one hour, before the
polypeptides in the cell lysates were separated by
SDS-polyacrylamide gel electrophoresis. The separated polypeptides
were transferred to a nitrocellulose membrane in the usual manner,
and the membrane was soaked in the culture supernatant of the
hybridoma H-1, disclosed in Japanese Patent Kokai No. 231,598/96
by the same applicant as this invention, followed by washing
with 50 mM Tris-HCl buffer containing 0.05 v/v % TWEEN 20 to
remove an excess amount of the monoclonal antibody.
Thereafter, the nitrocellulose membrane was soaked for one hr in
PBS containing rabbit-derived anti-mouse immunoglobulin antibody
labeled with horseradish peroxidase, followed
by washing with 50 mM Tris-HCl buffer (pH 7.5) containing 0.05 v/v
% TWEEN 20 and soaking in 50 mM Tris-HCl buffer (pH 7.5)
containing 0.005 v/v % hydrogen peroxide and 0.3 mg/ml
diaminobenzidine to develop coloration. The clone which
produced the polypeptide at a high level was selected based on
the color development and named "BGHuGF".
Example 4 

Production of polypeptide by transformant and its
physicochemical properties

The transformant BGHuGF in Example 3 was inoculated
into Ham's F12 medium (pH 7.2) containing 400 µg/ml G-418 and
10 v/v % bovine fetal serum, and cultured under an ambient
condition of 5 v/v % CO2 at 37 °C for one week. The proliferated
cells were collected, washed with PBS, and then washed with
10-fold volumes of ice-chilled 20 mM Hepes buffer (pH 7.4)
containing 10 mM potassium chloride and 0.1 mM disodium
ethylenediaminetetraacetate, according to the
method described in "Proceedings of the National Academy of
Sciences of the USA", vol. 86, pp. 5,227-5,231 (1989), by M. J.
Kostura et al. The cells thus obtained were allowed to stand
in 3-fold volumes of a fresh preparation of the same buffer
under an ice-chilling condition for 20 min and frozen at -80 °C,
followed by thawing to disrupt the cells. The disrupted cells
were centrifuged to collect the supernatant.

In parallel, THP-1 cells ATCC TIB 202, derived from
a human acute monocytic leukemia, were similarly cultured and
disrupted. The supernatant obtained by centrifuging the disrupted
cells was mixed with the supernatant obtained from the
transformant BGHuGF and incubated at 37 °C for 3 hr to react.
The reaction mixture was applied to a column of
"DEAE-SEPHAROSE", a gel for ion-exchange chromatography
commercialized by Pharmacia LKB Biotechnology AB, Uppsala,
Sweden, equilibrated with 10 mM phosphate buffer (pH 6.6) before
use. After washing the column with 10 mM phosphate buffer (pH
6.6), 10 mM phosphate buffer (pH 6.6) with a stepwise gradient
of NaCl increasing from 0 M to 0.5 M was fed to the column, and
fractions eluted at about 0.2 M NaCl were collected. The
fractions were dialyzed against 10 mM phosphate buffer (pH 6.8)
before being applied to a column of "DEAE 5PW", a gel for
ion-exchange chromatography commercialized by TOSOH Corporation,
Tokyo, Japan. To the column was fed 10 mM phosphate buffer (pH
6.8) with a linear gradient of NaCl increasing from 0 M to 0.5
M, and fractions eluted at about 0.2-0.3 M NaCl were collected.

While the obtained fractions were pooled and dialyzed
against PBS, a gel for immunoaffinity chromatography with the
monoclonal antibody was prepared according to the method
disclosed in Japanese Patent Kokai No. 231,598/96 by the same
applicant as this invention. After the gel was charged into
a plastic column and washed with PBS, the above dialyzed
solution was applied to the column. To the column was fed 100
mM glycine-HCl buffer (pH 2.5), and the eluted fractions, which
contained a polypeptide capable of inducing the production of
IFN-γ by immunocompetent cells, were collected. After the
collected fractions were dialyzed against sterilized distilled
water and concentrated by membrane filtration, the resultant
was lyophilized to obtain a purified solid polypeptide in a
yield of about 15 mg/l-culture.
Example for Reference 
Expression in Escherichia coli 

As disclosed in Japanese Patent Kokai No. 193,098/96,
a transformant pKHuGF, which was obtained by introducing a cDNA
with the nucleotide sequence of SEQ ID NO: 2 into Escherichia
coli as a host, was inoculated into L-broth medium containing
50 µg/ml ampicillin and cultured at 37 °C for 18 hr under shaking
conditions. The cells were collected by centrifuging the
resulting culture, and then suspended in a mixture solution (pH
7.2) of 139 mM NaCl, 7 mM NaH2PO4 and 3 mM Na2HPO4, followed by
ultrasonication to disrupt the cells. After the disrupted
cells were centrifuged, the supernatant was subjected to
purifying steps similarly as in Example 4-1 to obtain a purified
solid polypeptide in a yield of about 5 mg/l-culture.

Comparing the yields of the polypeptides in Example
for Reference and in Example 4-1 shows that the use of a
transformant, which is formed by introducing a genomic DNA
encoding the present polypeptide into a mammalian cell as a
host, raises the yield of the polypeptide per culture about
3-fold.
Example 4-2 

Physicochemical property of polypeptide 
Example 4-2(a) 
Biological activity 

Blood was collected from a healthy donor by using a
syringe containing heparin, and then diluted with a 2-fold volume
of serum-free RPMI-1640 medium (pH 7.4). The blood was overlaid
on ficoll, commercialized by Pharmacia LKB Biotechnology AB,
Uppsala, Sweden, and centrifuged to obtain lymphocytes, which
were then washed with RPMI-1640 medium containing 10 v/v %
bovine fetal serum before being suspended in a fresh preparation
of the same medium to give a cell density of 5 x 10^6 cells/ml.
0.15 ml aliquots of the cell suspension were distributed into
the wells of 96-well microplates.

To the wells with the cells were distributed 0.05 ml
aliquots of solutions of the polypeptide in Example 4-1, diluted
with RPMI-1640 medium (pH 7.4) containing 10 v/v % bovine fetal
serum to give desired concentrations. 0.05 ml aliquots of fresh
preparations of the same medium with or without 2.5 µg/ml
concanavalin A or 50 units/ml recombinant human interleukin 2
were further added to the wells, before culturing in a 5 v/v %
CO2 incubator at 37 °C for 24 hr. After the cultivation, 0.1 ml
of the culture supernatant was collected from each well and
examined for IFN-γ by usual enzyme immunoassay. In parallel,
control systems using the polypeptide in Example for Reference in
place of that in Example 4-1, or using no polypeptide, were
treated similarly as above. The results are in Table 1. IFN-γ
in Table 1 is expressed in international units (IU), calculated
based on the IFN-γ standard, Gg23-901-530, obtained from the
National Institutes of Health, USA.



Table 1

Sample of polypeptide       IFN-γ production (IU/ml)
----------------------------------------------------
Example 4-2(a)              3.4 x 10^5
Example for Reference       1.7 x 10^5



Table 1 indicates that the lymphocytes as
immunocompetent cells produce IFN-γ by the action of the present
polypeptide. The IFN-γ production is enhanced in combination
with concanavalin A or interleukin 2 as a cofactor.

It is more remarkable that the polypeptide in Example
4-1 induced more IFN-γ production than that in Example for
Reference. Considering this and the difference in the yields
of the polypeptides described in Example for Reference, it can
be presumed that, even if DNAs are substantially equivalent in
encoding the same amino acid sequence, not only may their
expression efficiencies differ, but the products expressed
by them may also differ significantly in their biological
activities as a result of post-translational modifications by
intracellular enzymes, depending on the types of the DNAs and
their hosts: (a) in one type, a transformant is formed by
introducing a DNA, which is a cDNA, into a microorganism as a
host, and (b) in the other type, a transformant is formed by
introducing the present genomic DNA into a mammalian cell as a
host.
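
The two comparisons drawn above, yield per culture and IFN-γ induction, reduce to simple ratios of the reported figures. The Python sketch below only restates that arithmetic; the dictionary keys and the helper name `fold_difference` are illustrative assumptions, and the reported yields are approximate ("about 15" and "about 5" mg/l-culture).

```python
def fold_difference(value: float, reference: float) -> float:
    """Ratio of `value` to `reference`; the reference must be positive."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


# Figures as reported in the text and in Table 1.
YIELD_MG_PER_L = {"genomic DNA, mammalian host": 15.0, "cDNA, E. coli host": 5.0}
IFN_GAMMA_IU_PER_ML = {"genomic DNA, mammalian host": 3.4e5, "cDNA, E. coli host": 1.7e5}

if __name__ == "__main__":
    for label, table in (("yield", YIELD_MG_PER_L), ("IFN-gamma", IFN_GAMMA_IU_PER_ML)):
        ratio = fold_difference(
            table["genomic DNA, mammalian host"], table["cDNA, E. coli host"]
        )
        print(f"{label}: {ratio:.1f}-fold")
```

On the reported figures the genomic DNA expressed in the mammalian host gives about 3 times the yield and about 2 times the IFN-γ production of the cDNA expressed in Escherichia coli.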

Example 4-2(b) 

Molecular weight 

SDS-polyacrylamide gel electrophoresis of the
polypeptide in Example 4-1 in the presence of 2 w/v %
dithiothreitol as a reducing agent, according to the method
reported by U. K. Laemmli et al. in "Nature", Vol. 227, pp.
680-685 (1970), exhibited a main band of a protein capable of
inducing IFN-γ in a position corresponding to a molecular weight
of about 18,000-19,500 daltons. The molecular weight markers
used in the analysis were bovine serum albumin (67,000 daltons),
ovalbumin (45,000 daltons), carbonic anhydrase (30,000 daltons),
soy bean trypsin inhibitor (20,100 daltons) and α-lactalbumin
(14,000 daltons).
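
Molecular weights are commonly read from such a gel by fitting a straight line to log10(molecular weight) against relative migration (Rf) of the markers. The Python sketch below shows that calculation as an illustration only: the marker weights are those listed above, but the Rf values are hypothetical placeholders, since no migration distances are given in the text, and the function names are assumptions.

```python
import math

# Marker molecular weights from the text (daltons).
MARKER_DALTONS = [67000, 45000, 30000, 20100, 14000]
# Hypothetical relative migrations; NOT measured values from this document.
HYPOTHETICAL_RF = [0.20, 0.33, 0.47, 0.62, 0.75]


def fit_log_mw(rf_values, mw_values):
    """Least-squares fit of log10(MW) = intercept + slope * Rf."""
    if len(rf_values) != len(mw_values) or len(rf_values) < 2:
        raise ValueError("need at least two matched (Rf, MW) pairs")
    ys = [math.log10(mw) for mw in mw_values]
    n = len(rf_values)
    mean_x = sum(rf_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in rf_values)
    if sxx == 0:
        raise ValueError("Rf values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(rf_values, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mw(rf, intercept, slope):
    """Molecular weight predicted by the fitted line at relative migration `rf`."""
    return 10 ** (intercept + slope * rf)


if __name__ == "__main__":
    a, b = fit_log_mw(HYPOTHETICAL_RF, MARKER_DALTONS)
    # 0.65 is an arbitrary example position for an unknown band.
    print(round(estimate_mw(0.65, a, b)))
```

With real migration distances substituted for the placeholders, a band running between the 20,100 and 14,000 dalton markers would be assigned a weight in the range reported above.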
Example 4-2(c)

N-Terminal amino acid sequence 

Conventional analysis using "MODEL 473A", a protein
sequencer commercialized by Perkin-Elmer Corp., Norwalk, USA,
revealed that the polypeptide in Example 4-1 had the amino acid
sequence of SEQ ID NO: 15 in the N-terminal region.

Judging collectively from this result as well as the
information that SDS-polyacrylamide gel electrophoresis exhibited
a main band in a position corresponding to a molecular weight
of about 18,000-19,500 daltons, and that the molecular weight
calculated from the amino acid sequence of SEQ ID NO: 1 was
18,199 daltons, it can be concluded that the polypeptide in
Example 4-1 has the amino acid sequence of SEQ ID NO: 6.
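
A molecular weight can be calculated from an amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The Python sketch below illustrates that calculation under stated assumptions: the residue masses are standard average values, the function names are illustrative, and the demonstration uses only two short stretches of SEQ ID NO: 1 as listed (residues 1-10 and 65-80), so it does not reproduce the 18,199 dalton figure for the full sequence.

```python
# Standard average residue masses (daltons), one-letter amino acid codes.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def average_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER


def masses_for_xaa(sequence: str, options=("I", "T")):
    """Mass for each allowed identity of Xaa (written X), here Ile or Thr."""
    return {aa: average_mass(sequence.upper().replace("X", aa)) for aa in options}


N_TERMINAL_10 = "YFGKLESKLS"           # residues 1-10 of SEQ ID NO: 1
RESIDUES_65_80 = "SVKCEKISXLSCENKI"    # residues 65-80, Xaa at position 73

if __name__ == "__main__":
    print(f"{average_mass(N_TERMINAL_10):.2f}")
    for aa, mass in masses_for_xaa(RESIDUES_65_80).items():
        print(aa, f"{mass:.2f}")
```

Because Xaa may be isoleucine or threonine, the two variants of the polypeptide differ in calculated mass by about 12 daltons, which is far below the resolution of the SDS-polyacrylamide gel.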

As is described above, the present invention is made
based on the identification of a genomic DNA encoding the
polypeptide which induces the production of IFN-γ by
immunocompetent cells. The present genomic DNA efficiently
expresses the present polypeptide when introduced into mammalian
host cells. The polypeptide features higher biological
activities than that obtained by the cDNA expression in
Escherichia coli. Therefore, the present genomic DNA is useful
for the recombinant DNA techniques to prepare the polypeptide
capable of inducing IFN-γ production by immunocompetent cells.
The present genomic DNA is useful for gene therapy for diseases
including viral diseases, bacterial-infectious diseases,
malignant tumors and immunopathies.

Thus, the present invention is a significant invention 
which has a remarkable effect and gives a great contribution to 
this field. 

While there has been described what is at present
considered to be the preferred embodiments of the present
invention, it will be understood that various modifications may
be made therein, and it is intended to cover in the appended
claims all such modifications as fall within the true spirit
and scope of the invention.

WHAT IS CLAIMED IS:

1. A composition comprising an isolated DNA molecule
comprising a nucleotide sequence encoding the amino acid
sequences shown in SEQ ID NO: 1, where Xaa is isoleucine or
threonine, and a carrier capable of introducing the isolated DNA
molecule into a mammalian cell, wherein said nucleotide sequence
consists of the sequence of a fragment of human genomic DNA.

2. A method for treating IFN-γ and/or killer
cell-susceptive diseases using gene therapy, comprising administering
the composition according to claim 1 to a subject in need
thereof.

3. A method for treating tumors using gene therapy, 
comprising the steps of: 

transforming tumor cells obtained from a subject in 
need thereof with the composition according to claim 1; 

proliferating the transformed tumor cells ex vivo; and 

transplanting the proliferated transformed tumor cells 
into the subject in need thereof. 

4. The composition according to claim 1, wherein the 
nucleotide sequence comprises an exon having the sequence shown 
in SEQ ID NO: 3, 4, 5, 6, or 7. 

5. The composition according to claim 1, wherein the
nucleotide sequence comprises an intron having the sequence shown
in SEQ ID NO: 8, 9, 10, 11, or 12.

6. The composition according to claim 1, wherein the
nucleotide sequence is the sequence shown in SEQ ID NO: 13, 14, or
15.

7. The composition according to claim 1, wherein the
carrier is a virus or liposome.

8. A method for treating IFN-γ and/or killer
cell-susceptive diseases using gene therapy, comprising administering
the composition according to claim 7 to a subject in need
thereof.

9. A method for treating tumors using gene therapy, 
comprising the steps of: 

transforming tumor cells obtained from a subject in 
need thereof with the composition according to claim 7; 

proliferating the transformed tumor cells ex vivo; and 

transplanting the proliferated transformed tumor cells 
into the subject in need thereof. 

10. The composition according to claim 1, wherein the
isolated DNA molecule is linked with a heterologous nucleotide
sequence.

11. A method for treating IFN-γ and/or killer
cell-susceptive diseases using gene therapy, comprising administering
the composition according to claim 10 to a subject in need
thereof.

12. A method for treating tumors using gene therapy,
comprising the steps of:

transforming tumor cells obtained from a subject in 
need thereof with the composition according to claim 10; 

proliferating the transformed tumor cells ex vivo; and 

transplanting the proliferated transformed tumor cells 
into the subject in need thereof. 

13. The composition according to claim 6, wherein the
heterologous nucleotide sequence is of a virus vector.

14. A method for treating IFN-γ and/or killer
cell-susceptive diseases using gene therapy, comprising administering
the composition according to claim 13 to a subject in need
thereof.

15. A method for treating tumors using gene therapy,
comprising the steps of:

transforming tumor cells obtained from a subject in 
need thereof with the composition according to claim 13; 

proliferating the transformed tumor cells ex vivo; and 

transplanting the proliferated transformed tumor cells 
into the subject in need thereof. 

16. A method for treating IFN-γ- and/or killer
cell-susceptive diseases using gene therapy, comprising administering
to a subject in need thereof an isolated DNA molecule comprising
a nucleotide sequence encoding the amino acid sequence shown in
SEQ ID NO: 1, where Xaa is isoleucine or threonine, wherein the
nucleotide sequence consists of the sequence of a fragment of
human genomic DNA.

17. A method for treating tumors using gene therapy,
comprising the steps of:

transforming tumor cells obtained from a subject in
need thereof with an isolated DNA molecule comprising a
nucleotide sequence encoding the amino acid sequence shown in SEQ
ID NO: 1, where Xaa is isoleucine or threonine, wherein the
nucleotide sequence consists of the sequence of a fragment of
human genomic DNA;

proliferating the transformed tumor cells ex vivo; and

transplanting the proliferated transformed tumor cells 
into the subject in need thereof. 

Abstract of the Disclosure

Disclosed is a genomic DNA encoding a polypeptide
capable of inducing the production of interferon-γ by
immunocompetent cells. The genomic DNA efficiently expresses
the polypeptide with high biological activities, such as
inducing the production of interferon-γ by immunocompetent
cells, enhancing killer cells' cytotoxicity and inducing killer
cells' formation, when introduced into mammalian host cells.
The high biological activities of the polypeptide facilitate its
uses to treat and/or prevent malignant tumors, viral diseases,
bacterial infectious diseases and immune diseases without
serious side effects when administered to humans.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Takanori OKURA 
Kakuji TORIGOE 
Masahi KURIMOTO 

(ii) TITLE OF INVENTION: GENOMIC DNA ENCODING A POLYPEPTIDE CAPABLE 

INDUCING THE PRODUCTION OF INTERFERON- y 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: B ROWDY AND NEIMARK 

(B) STREET: 419 Seventh Street, N.W., Suite 300 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP : 20004 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 185,305/96 

(B) FILING DATE: 27-JUN-1996 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: BROWDY, Roger L. 

(B) REGISTRATION NUMBER: 25,618 

(C) REFERENCE/DOCKET NUMBER: OKURA=1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-628-5197 

(B) TELEFAX: 202-737-3528 

(2) INFORMATION FOR SEQ ID NO : 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1: 



Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn
1               5                   10                  15

Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp
            20                  25                  30

Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile
        35                  40                  45

Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
    50                  55                  60

Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys Glu Asn Lys Ile
65                  70                  75                  80

Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys
                85                  90                  95

Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys
            100                 105                 110

Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu
        115                 120                 125

Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu
    130                 135                 140

Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp
145                 150                 155
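
As an illustrative aid only (not part of the filed listing), the following Python sketch converts the three-letter residues of SEQ ID NO:1 to one-letter codes and confirms the stated length of 157 amino acids. The names THREE_TO_ONE, SEQ_ID_NO_1, and to_one_letter are hypothetical, and the variable residue Xaa is written as "X" by assumption.

```python
# Three-letter to one-letter amino acid codes; "Xaa" (the variable
# residue, isoleucine or threonine) is mapped to "X" for illustration.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

# Residues transcribed from the SEQ ID NO:1 listing, 16 per row.
SEQ_ID_NO_1 = """
Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn
Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp
Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile
Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile
Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys Glu Asn Lys Ile
Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys
Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys
Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu
Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu
Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp
""".split()


def to_one_letter(residues):
    """Join a list of three-letter residue codes into a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


ONE_LETTER = to_one_letter(SEQ_ID_NO_1)
XAA_POSITION = SEQ_ID_NO_1.index("Xaa") + 1  # 1-based position of Xaa

if __name__ == "__main__":
    print(len(SEQ_ID_NO_1), XAA_POSITION)
    print(ONE_LETTER)
```

A length check of this kind is a quick way to catch residues dropped or duplicated during transcription of the listing.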

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1120 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: No 

(iv) ANTI- SENSE: No 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 
(F) TISSUE TYPE: liver 

(ix) FEATURE: 

(A) NAME/KEY: 5'UTR 

(B) LOCATION: 1..177 

(C) IDENTIFICATION METHODS: E 

(A) NAME/KEY: leader peptide 

(B) LOCATION: 178..285 

(C) IDENTIFICATION METHODS: S 

(A) NAME/KEY: mat peptide 

(B) LOCATION: 286..756 

(C) IDENTIFICATION METHODS: S 

(A) NAME/KEY: 3'UTR 

(B) LOCATION: 757..1120 

(C) IDENTIFICATION METHODS: E 
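
The feature locations above can be cross-checked arithmetically. The short Python sketch below is offered only as an illustration: it verifies that the four features tile positions 1 to 1120 without gaps and that the mature-peptide region spans 471 nucleotides, i.e. 157 codons, matching the length of SEQ ID NO:1. The names FEATURES, feature_lengths, and is_contiguous are hypothetical.

```python
# Feature table of SEQ ID NO:2 as (name, start, end), 1-based inclusive.
FEATURES = [
    ("5'UTR", 1, 177),
    ("leader peptide", 178, 285),
    ("mat peptide", 286, 756),
    ("3'UTR", 757, 1120),
]
TOTAL_LENGTH = 1120  # stated length of SEQ ID NO:2 in base pairs


def feature_lengths(features):
    """Return the length in nucleotides of each feature."""
    return {name: end - start + 1 for name, start, end in features}


def is_contiguous(features, total_length):
    """True if the features cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for _, start, end in features:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


LENGTHS = feature_lengths(FEATURES)

if __name__ == "__main__":
    print(LENGTHS)
    print(is_contiguous(FEATURES, TOTAL_LENGTH))
    print(LENGTHS["mat peptide"] // 3)  # number of mature-peptide codons
```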

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GCCTGGACAG TCAGCAAGGA ATTGTCTCCC AGTGCATTTT GCCCTCCTGG CTGCCAACTC 60 
TGGCTGCTAA AGCGGCTGCC ACCTGCTGCA GTCTACACAG CTTCGGGAAG AGGAAAGGAA 120 
CCTCAGACCT TCCAGATCGC TTCCTCTCGC AACAAACTAT TTGTCGCAGG AATAAAG 177 



ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA ATG     225
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala Met
    -35                 -30                 -25

AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA GCT GAA GAT GAT GAA AAC     273
Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala Glu Asp Asp Glu Asn
-20                 -15                 -10                 -5

CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA GTC ATA     321
Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser Val Ile
                1               5                   10

AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT CGG CCT     369
Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro
        15                  20                  25

CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA GAT AAT GCA CCC CGG     417
Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp Asn Ala Pro Arg
    30                  35                  40

ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG     465
Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met
45                  50                  55                  60

GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT TCA AYT CTC TCC TGT     513
Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile Ser Xaa Leu Ser Cys
                65                  70                  75

GAG AAC AAA ATT ATT TCC TTT AAG GAA ATG AAT CCT CCT GAT AAC ATC     561
Glu Asn Lys Ile Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile
            80                  85                  90

AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG AGA AGT GTC CCA GGA     609
Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln Arg Ser Val Pro Gly
        95                  100                 105

CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA TAC GAA GGA TAC TTT     657
His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe
    110                 115                 120

CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA CTC ATT TTG AAA AAA     705
Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys Leu Ile Leu Lys Lys
125                 130                 135                 140

GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC ACT GTT CAA AAC GAA     753
Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu
                145                 150                 155

GAC [remainder of line illegible in source]     806
Asp

GCCCTTTGGG AGGCTGAGGC GGGCAGATCA CCAGAGGTCA GGTGTTCAAG ACCAGCCTGA 866 
CCAACATGGT GAAACCTCAT CTCTACTAAA AATACTAAAA ATTAGCTGAG TGTAGTGACG 926 
CATGCCCTCA ATCCCAGCTA CTCAAGAGGC TGAGGCAGGA GAATCACTTG CACTCCGGAG 986 
GTAGAGGTTG TGGTGAGCCG AGATTGCACC ATTGCGCTCT AGCCTGGGCA ACAACAGCAA 1046 
AACTCCATCT CAAAAAATAA AATAAATAAA TAAACAAATA AAAAATTCAT AATGTGAAAA 1106 
AAAAAAAAAA AAAA 1120 
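
The codon for Xaa in SEQ ID NO:2 is written AYT, where Y is the IUPAC ambiguity code for C or T. The following Python sketch, given only as an illustration, expands the ambiguous codon with the standard genetic code and shows that it corresponds to isoleucine or threonine, consistent with the definition of Xaa in the claims. The names CODON_TABLE, IUPAC, expand_codon, and encoded_residues are hypothetical, and only the ambiguity codes needed here are included.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Subset of IUPAC nucleotide codes (Y = pyrimidine, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_codon(codon):
    """List every unambiguous codon matched by an IUPAC-coded codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in codon))]


def encoded_residues(codon):
    """Return the set of one-letter residues the ambiguous codon can encode."""
    return {CODON_TABLE[c] for c in expand_codon(codon)}


XAA_CODONS = expand_codon("AYT")
XAA_RESIDUES = encoded_residues("AYT")

if __name__ == "__main__":
    print(XAA_CODONS, sorted(XAA_RESIDUES))
```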

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..135 

(C) IDENTIFICATION METHODS: S 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

 AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA TCT AAA TTA TCA     47
Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser
    -5                  1               5                   10

GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT GAC CAA GGA AAT     95
Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe Ile Asp Gln Gly Asn
                15                  20                  25

CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT AGA G     135
Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp Cys Arg Asp
            30                  35                  40

(2) INFORMATION FOR SEQ ID NO : 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..134 

(C) IDENTIFICATION METHODS: S 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



 AT AAT GCA CCC CGG ACC ATA TTT ATT ATA AGT ATG TAT AAA GAT AGC     47
Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile Ser Met Tyr Lys Asp Ser
                    45                  50                  55

CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT GTG AAG TGT GAG AAA ATT     95
Gln Pro Arg Gly Met Ala Val Thr Ile Ser Val Lys Cys Glu Lys Ile
                60                  65                  70

TCA ACT CTC TCC TGT GAG AAC AAA ATT ATT TCC TTT AAG     134
Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile Ser Phe Lys
            75                  80
















(2) INFORMATION FOR SEQ ID NO: 5: 

















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..87 

(C) IDENTIFICATION METHODS: S 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GAATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG     50
         Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val
             -35                 -30                 -25

GCA ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G     87
Ala Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
        -20                 -15                 -10
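
For illustration only, the Python sketch below translates the exon of SEQ ID NO:5 with the standard genetic code. It assumes, as the layout of the listing indicates, that the first eight nucleotides (GAATAAAG) are untranslated and that the final G is the first base of a codon completed by the next exon, so it is left untranslated. The names CODON_TABLE, SEQ_ID_NO_5, and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Nucleotides of SEQ ID NO:5 as grouped in the listing; spaces are removed.
SEQ_ID_NO_5 = (
    "GAATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG "
    "GCA ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G"
).replace(" ", "")


def translate(dna, offset=0):
    """Translate complete codons of dna starting at a 0-based offset."""
    coding = dna[offset:]
    usable = len(coding) - len(coding) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


LEADER_START = translate(SEQ_ID_NO_5, offset=8)

if __name__ == "__main__":
    print(len(SEQ_ID_NO_5))
    print(LEADER_START)
```

The translated residues correspond to positions -36 through -11 of the leader peptide as numbered in SEQ ID NO:2.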

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..87 

(C) IDENTIFICATION METHODS: S 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

 CT GAA GAT GAT G     12
Ala Glu Asp Asp Glu
-10



(2) INFORMATION FOR SEQ ID NO : 7: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 2167 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: exon + 3'UTR 

(B) LOCATION: 1..2167 

(C) IDENTIFICATION METHODS: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7: 



GAA ATG AAT CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA     48
Glu Met Asn Pro Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile
85                  90                  95                  100

TTC TTT CAG AGA AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA     96
Phe Phe Gln Arg Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu
                105                 110                 115

TCT TCA TCA TAC GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC     144
Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp
            120                 125                 130

CTT TTT AAA CTC ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT     192
Leu Phe Lys Leu Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser
        135                 140                 145

ATA ATG TTC ACT GTT CAA AAC GAA GAC TAGCTAT TAAAATTTCA TGCCGGGCGC     246
Ile Met Phe Thr Val Gln Asn Glu Asp
    150                 155

















AGTGGCTCAC GCCTGTAATC CCAGCCCTTT GGGAGGCTGA GGCGGGCAGA TCACCAGAGG 306 
TCAGGTGTTC AAGACCAGCC TGACCAACAT GGTGAAACCT CATCTCTACT AAAAATACAA 366 
AAAATTAGCT GAGTGTAGTG ACCCATGCCC TCAATCCCAG CTACTCAAGA GGCTGAGGCA 426 
GGAGAATCAC TTGCACTCCG GAGGTGGAGG TTGTGGTGAG CCGAGATTGC ACCATTGCGC 486 
TCTAGCCTGG GCAACAACAG CAAAACTCCA TCTCAAAAAA TAAAATAAAT AAATAAACAA 546 
ATAAAAAATT CATAATGTGA ACTGTCTGAA TTTTTATGTT TAGAAAGATT ATGAGATTAT 606 
TAGTCTATAA TTGTAATGGT GAAATAAAAT AAATACCAGT CTTGAAAAAC ATCATTAAGA 666 
AATGAATGAA CTTTCACAAA AGCAAACAAA CAGACTTTCC CTTATTTAAG TGAATAAAAT 726 
AAAATAAAAT AAAATAATGT TTAAAAAATT CATAGTTTGA AAACATTCTA CATTGTTAAT 786 
TGGCATATTA ATTATACTTA ATATAATTAT TTTTAAATCT TTTGGGTTAT TAGTCCTAAT 846 
GACAAAAGAT ATTGATATTT GAACTTTCTA ATTTTTAAGA ATATCGTTAA ACCATCAATA 906 
TTTTTATAAG GAGGCCACTT CACTTGACAA ATTTCTGAAT TTCCTCCAAA GTCAGTATAT 966 
TTTTAAAATT CAGTTTGATC CTGAATCCAG CAATATATAA AAGGGATTAT ATACTCTGGC 1026 
CAACTGACAT TCATCCTAGG AATGCAAAGA TGGTTTAATA TCCTAAAATC AATTAACATA 1086 
ACATACTATA TTAATAAAGT ATCAAAACAG TATTCTCATC TTTTTTTCTT TTTTCACAAT 1146 
TCCTTGGTTA CACTATCATC TCAATAGATG CAGAAAAAGC ATTTGACAAA ATCCAATTCA 1206 
TAATAAAAAT TCTCAAACTT GAAAGAGAAC ATCATAAAGG CATCTATGAA AAACCTACAG 1266 
CTAATATCAT ACTTAACGAT GAAAAACTGA ATTATTTTAC CCTAAGATCA AGAATAATGC 1326 
AAGCATGTCA GCTCTTGCAA CTTCTATTCA ACATTGTACT GGAGGTTCTA GCCAGAGCAA 1386 
CCATACAATA AATAAAAATA AAAGGCACCC AGATTAGAAA GGAAGTCTTT ATTTGCAGAC 1446 
AACATGGTTC TTTATGCAGA AAACCGTCAG GAATACACAC ACATGTTAGA ACTAATAAGT 1506 
TCAGCAAGGT TGCAGGTTGC AATATCAATA TGCAAAAATA CATTGAAGGC TGGGCTCAGT 1566 
GGAGATGGCA TGTACCTTTC GTCCCAGCTA CTTGGGAGGC TGAGGTAGGA GGATCACTTG 1626 
AGGTGAGGAG TTTGAGGCTA TAGTGCAATG TGATCTTGCC TGTGAATAGC CACTGCACTC 1686 
GAGCCTAGGC AACAAAGTGA GACCCCGTCT CCAAAAAAAA AAATGGTATA TTGGTATTTC 1746 
TGTATATGAA CAATGAATGA TCTGAAAACA AGAAAATTCC ATTCACGATG GTATTAAAAA 1806 
AATAAAATAC AAATAAATTT AGCAAAATAA TTATAAAACT TGTACATCGA AAATTTCAAA 1866 
GCACTCTGAG GGAAATTAAA GATGATCTAA ATAATTGGAG AGACACTCTA TGATCACTGA 1926 
TTGGAAAATT CATTCAATAT TGTTAAGATA ACAATTGTCC CCAAATTGAT GCATGCATTC 1986 
AATTTAGTCT TCATCAAAAT TCCAGCAGGG TTTTTGCAGA AATTGACAAG CTGTACCCAA 2046 
AATGTATATG GAAATGAAAA GACCCAGAAG AGCAAATAAT TTTTTAAAAA CAAAGTTGGA 2106 
AAACTTTTAC TTCCTAATTT TAAAACTTAC TATAAACCTA AAGTTATCAA GACCATTTAG 2166 

2167 



(2) INFORMATION FOR SEQ ID NO : 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1334 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 1..1334 

(C) IDENTIFICATION METHODS: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GTATTTTTTT TAATTCGCAA ACATAGAAAT GACTAGCTAC TTCTTCCCAT TCTGTTTTAC 60 
TGCTTACATT GTTCCGTGCT AGTCCCAATC CTCAGATGAA AAGTCACAGG AGTGACAATA 120 
ATTTCACTTA CAGGAAACTT TATAAGGCAT CCACGTTTTT TAGTTGGGGT AAAAAATTGG 180 
ATACAATAAG ACATTGCTAG GGGTCATGCC TCTCTGAGCC TGCCTTTGAA TCACCAATCC 240 
CTTTATTGTG ATTGCATTAA CTGTTTAAAA CCTCTATAGT TGGATGCTTA ATCCCTGCTT 300 
GTTACAGCTG AAAATGCTGA TAGTTTACCA GGTGTGGTGG CATCTATCTG TAATCCTAGC 360 
TACTTGGGAG GCTCAAGCAG GAGGATTGCT TGAGGCCAGG ACTTTGAGGC TGTAGTACAC 420 
TGTGATCGTA CCTGTGAATA GCCACTGCAC TCCAGCCTGG GTGATATACA GACCTTGTCT 480 
CTAAAATTAA AAAAAAAAAA AAAAAAAACC TTAGGAAAGG AAATTGATCA AGTCTACTGT 540 
GCCTTCCAAA ACATGAATTC CAAATATCAA AGTTAGGCTG AGTTGAAGCA GTGAATGTGC 600 
ATTCTTTAAA AATACTGAAT ACTTACCTTA ACATATATTT TAAATATTTT ATTTAGCATT 660 
TAAAAGTTAA AAACAATCTT TTAGAATTCA TATCTTTAAA ATACTCAAAA AAGTTGCAGC 720 
GTGTGTGTTG TAATACACAT TAAACTGTGG GGTTGTTTGT TTGTTTGAGA TGCAGTTTCA 780 
CTCTGTC£CC CAGGCTGAAG TGCAGTGCAG TGCAGTGGTG TGATCTCGGC TCACTACAAC 840 
CTCCACCTCC CACGTTCAAG CGATTCTCAT GCCTCAGTCT CCCGAGTAGG TGGGATTACA 900 
GGCATGCACC ACTTACACCC GGCTAATTTT TGTATTTTTA GTAGAGCTGG GGTTTCACCA 960 
TGTTGGCCAG GCTGGTCTCA AACCCCTAAC CTCAAGTGAT CTGCCTGCCT CAGCCTCCCA 1020 
AACAAACAAA CAACCCCACA GTTTAATATG TGTTACAACA CACATGCTGC AACTTTTATG 1080 
AGTATTTTAA TGATATAGAT TATAAAAGGT TGTTTTTAAC TTTTAAATGC TGGGATTACA 1140 
GGCATGAGCC ACTGTGCCAG GCCTGAACTG TGTTTTTAAA AATGTCTGAC CAGCTGTACA 1200 
TAGTCTCCTG CAGACTGGCC AAGTCTCAAA GTGGGAACAG GTGTATTAAG GACTATCCTT 1260 
TGGTTAAATT TCCGCAAATG TTCCTGTGCA AGAATTCTTC TAACTAGAGT TCTCATTTAT 1320 
TATATTTATT TCAG 1334 
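
Each line of a nucleotide listing ends with the running count of bases through that line, which permits a simple consistency check on a transcription. The Python sketch below is a minimal illustration under stated assumptions: letters on a line are taken as bases and all digits on the line are read together as the running total, so stray spaces inside a group or inside the number are tolerated. The names parse_listing_line and check_running_totals are hypothetical, and the sample lines are the first two lines of SEQ ID NO:8.

```python
import re


def parse_listing_line(line):
    """Split one listing line into (bases, running_total).

    Letters are treated as bases; all digits on the line are joined
    to form the running total, ignoring any interior spaces.
    """
    bases = "".join(re.findall(r"[A-Za-z]", line)).upper()
    digits = "".join(re.findall(r"\d", line))
    return bases, int(digits)


def check_running_totals(lines):
    """Return one boolean per line: does the cumulative base count match?"""
    total = 0
    results = []
    for line in lines:
        bases, stated = parse_listing_line(line)
        total += len(bases)
        results.append(total == stated)
    return results


# First two lines of SEQ ID NO:8 as transcribed in the listing.
SAMPLE = [
    "GTATTTTTTT TAATTCGCAA ACATAGAAAT GACTAGCTAC TTCTTCCCAT TCTGTTTTAC 60",
    "TGCTTACATT GTTCCGTGCT AGTCCCAATC CTCAGATGAA AAGTCACAGG AGTGACAATA 120",
]
RESULTS = check_running_totals(SAMPLE)

if __name__ == "__main__":
    print(RESULTS)
```

A line reported as inconsistent points to a dropped or duplicated base, or a misread running total, somewhere at or before that line.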

(2) INFORMATION FOR SEQ ID NO : 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4773 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 1..4773 

(C) IDENTIFICATION METHODS: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9: 

GTAAGACTGA GCCTTACTTT GTTTTCAATC ATGTTAATAT AATCAATATA ATTAGAAATA 60 
TAACATTATT TCTAATGTTA ATATAAGTAA TGTAATTAGA AAACTCAAAT ATCCTCAGAC 120 
CAACCTTTTG TCTAGAACAG AAATAACAAG AAGCAGAGAA CCATTAAAGT GAATACTTAC 180 
TAAAAATTAT CAAACTCTTT ACCTATTGTG ATAATGATGG TTTTTCTGAG CCTGTCACAG 240 
GGGAAGAGGA GATACAACAC TTGTTTTATG ACCTGCATCT CCTGAACAAT CAGTCTTTAT 300 
ACAAATAATA ATGTAGAATA CATATGTGAG TTATACATTT AAGAATAACA TGTGACTTTC 360 
CAGAATGAGT TCTGCTATGA AGAATGAAGC TAATTATCCT TCTATATTTC TACACCTTTG 420 
TAAATTATGA TAATATTTTA ATCCCTAGTT GTTTTGTTGC TGATCCTTAG CCTAAGTCTT 480 
AGACACAAGC TTCAGCTTCC AGTTGATGTA TGTTATTTTT AATGTTAATC TAATTGAATA 540 
AAAGTTATGA GATCAGCTGT AAAAGTAATG CTATAATTAT CTTCAAGCCA GGTATAAAGT 600 
ATTTCTGGCC TCTACTTTTT CTCTATTATT CTCCATTATT ATTCTCTATT ATTTTTCTCT 660 
ATTTCCTCCA TTATTGTTAG ATAAACCACA ATTAACTATA GCTACAGACT GAGCCAGTAA 720 
GAGTAGCCAG GGATGCTTAC AAATTGGCAA TGCTTCAGAG GAGAATTCCA TGTCATGAAG 780 
ACTCTTTTTG AGTGGAGATT TGCCAATAAA TATCCGCTTT CATGCCCACC CAGTCCCCAC 840 
TGAAAGACAG TTAGGATATG ACCTTAGTGA AGGTACCAAG GGGCAACTTG GTAGGGAGAA 900 
AAAAGCCACT CTAAAATATA ATCCAAGTAA GAACAGTGCA TATGCAACAG ATACAGCCCC 960 
CAGACAAATC CCTCAGCTAT CTCCCTCCAA CCAGAGTGCC ACCCCTTCAG GTGACAATTT 1020 
GGAGTCCCCA TTCTAGACCT GACAGGCAGC TTAGTTATCA AAATAGCATA AGAGGCCTGG 1080 
GATGGAAGGG TAGGGTGGAA AGGGTTAAGC ATGCTGTTAC TGAACAACAT AATTAGAAGG 1140 
GAAGGAGATG GCCAAGCTCA AGCTATGTGG GATAGAGGAA AACTCAGCTG CAGAGGCAGA 1200 
TTCAGAAACT GGGATAAGTC CGAACCTACA GGTGGATTCT TGTTGAGGGA GACTGGTGAA 1260 
AATGTTAAGA AGATGGAAAT AATGCTTGGC ACTTAGTAGG AACTGGGCAA ATCCATATTT 1320 
GGGGGAGCCT GAAGTTTATT CAATTTTGAT GGCCCTTTTA AATAAAAAGA ATGTGGCTGG 1380 
GCGTGGTGGC TCACACCTGT AATCCCAGCA CTTTGGGAGG CCGAGGGGGG CGGATCACCT 1440 
GAAGTCAGGA GTTCAAGACC AGCCTGACCA ACATGGAGAA ACCCCATCTC TACTAAAAAT 1500 
ACAAAATTAG CTGGGCGTGG TGGCATATGC CTGTAATCCC AGCTACTCGG GAGGCTGAGG 1560 
CAGGAGAATC TTTTGAACCC GGGAGGCAGA GGTTGCGATG AGCCTAGATC GTGCCATTGC 1620 
ACTCCAGCCT GGGCAACAAG AGCAAAACTC GGTCTCAAAA AAAAAAAAAA AAAAGTGAAA 1680 
TTAACCAAAG GCATTAGCTT AATAATTTAA TACTGTTTTT AAGTAGGGCG GGGGGTGGCT 1740 
GGAAGAGATC TGTGTAAATG AGGGAATCTG ACATTTAAGC TTCATCAGCA TCATAGCAAA 1800 
TCTGCTTCTG GAAGGAACTC AATAAATATT AGTTGGAGGG GGGGAGAGAG TGAGGGGTGG 1860 
ACTAGGACCA GTTTTAGCCC TTGTCTTTAA TCCCTTTTCC TGCCACTAAT AAGGATCTTA 1920 
GCAGTGGTTA TAAAAGTGGC CTAGGTTCTA GATAATAAGA TACAACAGGC CAGGCACAGT 1980 
GGCTC^TGCC TATAATCCCA GCACTTTGGG AGGGCAAGGC GAGTGTCTCA CTTGAGATCA 2040 
GGAGTTCAAG ACCAGCCTGG CCAGCATGGC GATACTCTGT CTCTACTAAA AAAAATACAA 2100 
AAATTAGCCA GGCATGGTGG CATGCACCTG TAATCCCAGC TACTCGTGAG CCTGAGGCAG 2160 
AAGAATCGCT TGAAACCAGG AGGTGTAGGC TGCAGTGAGC TGAGATCGCA CCACTGCACT 2220 
CCAGCCTGGG CGACAGAATG AGACTTTGTC TCAAAAAAAG AAAAAGATAC AACAGGCTAC 2280 
CCTTATGTGC TCACCTTTCA CTGTTGATTA CTAGCTATAA AGTCCTATAA AGTTCTTTGG 2340 
TCAAGAACCT TGACAACACT AAGAGGGATT TGCTTTGAGA GGTTACTGTC AGAGTCTGTT 2400 
TCATATATAT ACATATACAT GTATATATGT ATCTATATCC AGGCTTGGCC AGGGTTCCCT 2460 
CAGACTTTCC AGTGCACTTG GGAGATGTTA GGTCAATATC AACTTTCCCT GGATTCAGAT 2520 
TCAACCCCTT CTGATGTAAA AAAAAAAAAA AAAAAGAAAG AAATCCCTTT CCCCTTGGAG 2580 
CACTCAAGTT TCACCAGGTG GGGCTTTCCA AGTTGGGGGT TCTCCAAGGT CATTGGGATT 2640 
GCTTTCACAT CCATTTGCTA TGTACCTTCC CTATGATGGC TGGGAGTGGT CAACATCAAA 2700 
ACTAGGAAAG CTACTGCCCA AGGATGTCCT TACCTCTATT CTGAAATGTG CAATAAGTGT 2760 
GATTAAAGAG ATTGCCTGTT CTACCTATCC ACACTCTCGC TTTCAACTGT AACTTTCTTT 2820 
TTTTCTTTTT TTCTTTTTTT CTTTTTTTTT GAAACGGAGT CTCGCTCTGT CGCCCAGGCT 2880 
AGAGTGCAGT GGCACGATCT CAGCTCACTG CAAGCTCTGC CTCCCGGGTT CACGCCATTC 2940 
TCCTGCCTCA CCCTCCCAAG CAGCTGGGAC TACAGGCGCC TGCCACCATG CCCAGCTAAT 3000 
TTTTTGTATT TTTAGTAGAG ACGGGGTTTC ACCGTGTTAG CCAGGATGGT CTCGATCTCC 3060 
TGAACTTGTG ATCCGCCCGC CTCAGCCTCC CAAAGTGCTG GGATTACAGG CGTGAGCCAT 3120 
CGCACCCGGC TCAACTGTAA CTTTCTATAC TGGTTCATCT TCCCCTGTAA TGTTACTAGA 3180 
GCTTTTGAAG TTTTGGCTAT GGATTATTTC TCATTTATAC ATTAGATTTC AGATTAGTTC 3240 
CAAATTGATG CCCACAGCTT AGGGTCTCTT CCTAAATTGT ATATTGTAGA CAGCTGCAGA 3300 
AGTGGGTGCC AATAGGGGAA CTAGTTTATA CTTTCATCAA CTTAGGACCC ACACTTGTTG 3360 
ATAAAGAACA AAGGTCAAGA GTTATGACTA CTGATTCCAC AACTGATTGA GAAGTTGGAG 3420 
ATAACCCCGT GACCTCTGCC ATCCAGAGTC TTTCAGGCAT CTTTGAAGGA TGAAGAAATG 3480 
CTATTTTAAT TTTGGAGGTT TCTCTATCAG TGCTTAGGAT CATGGGAATC TGTGCTGCCA 3540 
TGAGGCCAAA ATTAAGTCCA AAACATCTAC TGGTTCCAGG ATTAACATGG AAGAACCTTA 3600 
GGTGGTGCCC ACATGTTCTG ATCCATCCTG CAAAATAGAC ATGCTGCACT AACAGGAAAA 3660 
GTGCAGGCAG CACTACCAGT TGGATAACCT GCAAGATTAT AGTTTCAAGT AATCTAACCA 3720 
TTTCTCACAA GGCCCTATTC TGTGACTGAA ACATACAAGA ATCTGCATTT GGCCTTCTAA 3780 
GGCAGGGCCC AGCCAAGGAG ACCATATTCA GGACAGAAAT TCAAGACTAC TATGGAACTG 3840 
GAGTGCTTGG CAGGGAAGAC AGAGTCAAGG ACTGCCAACT GAGCCAATAC AGCAGGCTTA 3900 
CACAGGAACC CAGGGCCTAG CCCTACAACA ATTATTGGGT CTATTCACTG TAAGTTTTAA 3960 
TTTCAGGCTC CACTGAAAGA GTAAGCTAAG ATTCCTGGCA CTTTCTGTCT CTCTCACAGT 4020 
TGGCTCAGAA ATGAGAACTG GTCAGGCCAG GCATGGTGGC TTACACCTGG AATCCCAGCA 4080 
CTTTGGGAGG CCGAAGTGGG AGGGTCACTT GAGGCCAGGA GTTCAGGACC AGCTTAGGCA 4140 
ACAAAGTGAG ATACCCCGTG ACCCCTTCTC TACAAAAATA AATTTTAAAA ATTAGCCAAA 4200 
TGTGGTGGTG TATACTTACA GTCCCAGCTA CTCAGGAGGC TGAGGCAGGG GGATTGCTTG 4260 
AGCCCAGGAA TTCAAGGCTG CAGTGAGCTA TGATTTCACC ACTGCACTTC TGGCTGGGCA 4320 
ACAGAGCGAG ACCCTGTCTC AAAGCAAAAA GAAAAAGAAA CTAGAACTAG CCTAAGTTTG 4380 
TGGGAGGAGG TCATCATCGT CTTTAGCCGT GAATGGTTAT TATAGAGGAC AGAAATTGAC 4440 
ATTAGCCCAA AAAGCTTGTG GTCTTTGCTG GAACTCTACT TAATCTTGAG CAAATGTGGA 4500 
CACCACTCAA TGGGAGAGGA GAGAAGTAAG CTGTTTGATG TATAGGGGAA AACTAGAGGC 4560 
CTGGAACTGA ATATGCATCC CATGACAGGG AGAATAGGAG ATTCGGAGTT AAGAAGGAGA 4620 
GGAGGTCAGT ACTGCTGTTC AGAGATTTTT TTTATGTAAC TCTTGAGAAG CAAAACTACT 4680 
TTTGTTCTGT TTGGTAATAT ACTTCAAAAC AAACTTCATA TATTCAAATT GTTCATGTCC 4740 
TGAAATAATT AGGTAATGTT TTTTTCTCTA TAG 4773 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8835 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 1..8835 

(C) IDENTIFICATION METHODS: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GTAAGAAATA TCATTCCTCT TTATTTGGAA AGTCAGCCAT GGCAATTAGA GGTAAATAAG 60 
CTAGAAAGCA ATTGAGAGGA ATATAAACCA TCTAGCATCA CTACGATGAG CAGTCAGTAT 120 
CAACATAAGA AATATAAGCA AAGTCAGAGT AGAATTTTTT TCTTTTATCA GATATGGGAG 180 
AGTATCACTT TAGAGGAGAG GTTCTCAAAC TTTTTGCTCT CATGTTCCCT TTACACTAAG 240 
CACATCACAT GTTAGCATAA GTAACATTTT TAATTAAAAA TAACTATGTA CTTTTTTAAC 300 
AACAAAAAAA AGCATAAAGA GTGACACTTT TTTATTTTTA CAAGTGTTTT AACTGGTTTA 360 
ATAGAAGCCA TATAGATCTG CTGGATTCTC ATCTGCTTTG CATTCAGACT ACTGCAATAT 420 
TGCACAGAAT GCAGCCTCTG GTAAACTCTG TTGTACACTC ATGAGAGAAT GGGTGAAAAA 480 
GACAAATTAC GTCTTAGAAT TATTAGAAAT AGCTTTCACT TTAGGAACTC CCTGAGAATT 540 
GCTGCTTTAG AGTGGTAAGA TAAATAAGCT TCTCTTTAAA CGGAATCTCA AGACAGAATC 600 
AGTTACATTA AAAGCAAACA AAAAATTTGC CCATGGTTAG TCATCTTGTG AAATCTGCCA 660 
CACCTTTGGA CTGGGCTACA ATTGGATAAT ATAGCATTCC CCGAGATAAT TTTCTCTCAC 720 
AATTAAGGAA AGGGCTGAAT AAATATCTCT GTTTGAAGTT GAATAACAAA AATTAGGACC 780 
CCCTAAATTT TAGGGCTCCT GAAATTCGTC TTTTTGCCTA TATTCAGCTA CTTTACGTTC 840 
TATTAAATCT TCTTTCAGGC CAGGTGCACT AGCTCATGCC TAGAATCTCA GGCAGGCCTG 900 
AGCCCAGGAA TTTGAGACCA GCCAGGGCAA CACAGTCTCT ACAAAAAAAT AAAAAATTAC 960 
CTGGGTGTGT TGGTGCATGC CTGTAGAACT ACTCAGGATG CTGAGGACTG CTTGAGCCCA 1020 
GGATAGCCAA ATCTGTGGTG AGTTCAGCCA CTAAACAGAG CGAGACTTTC TCAAAAAAAC 1080 
AAACAAAAAA ACAAACAAAC TTCCTTCAAA ATAACTTTTT ATCTGCAATG TTTTCCTATT 1140 
GCCTGTGAGA TTAAATTTAC TCTTTTACCT GATTTCCAAA GCCCTCCATA ATCTAATCCG 1200 
ACTTTACCTT GTGTTCACTG CAAAATAGCA GGACTGTTCC ACTACAATCC AAAAATCACA 1260 
GGTTGGGTGC AGTGGCTCAC TCCTGTAATC CCAACACTTT GGAAGGCCAA GGCAGGTGGA 1320 
TTGCTTCAGC TCAGGAGTTC AAGACCAGCC TGGGCAACAT GGCAAAAACC CTGTCTCTCC 1380 
AAAACATACA AAAATTAGCC AGATGTGGTA GTATGTGCCT GTAGTCCCAA CTACTCAAAA 1440 
GGCTAAGGCA AGAGGATCAC TTGAGCCCAG GAGGTCAAGG CTACAGTGAG CCATGTTTAC 1500 
TGTGTCACTG CACTCCAGCC TGGGTGATAG AGCAAGACCA TGTCTCAAAA AAAAAAAAAA 1560 
GAAAAGAAAA GAAAAAAACA TCGCTCTATT CAGTTCACCC CCACCACAAC ATTGTTTTGA 1620 
TTATCACATA AATGCTGGTC CATTGCCTTC TCTATCTATT CAAATCTTTA AGCATTCTTT 1680 
GAGATTCAAC TCAATTCTCC TTTTCAAACT AGGCCATTTA AACTACATCA GTTCCATTTT 1740 
GATTTTCTTG CTTTGAGTCT ACAGACTCAA AAACAAAAAC TTAAAAACTT ATTTTTTAAG 1800 
TTTTCTGCTA CTCTCACTTC TTCAACACTC ACATACACGC ATTCATAATA AGATGGCAGA 1860 
ATGTTCAAGG ATAAAATGAT TTATAGAACT GAAAAGTTAG GTTTTGATCT TGTTGCTGTC 1920 
AAGATGACTA CCTACCTGAT CTCAGGTAAT TAATTATGTA GCATGCTCCC TCATTTCATC 1980 
CCATACCTAT TCAACAGGAT TGGAATTCCA CAGCAAGGAT AAACATAATC ATAGTTGCTT 2040 
TTCAAGTTCA AGGCATTTTA ACTTTTAATC TAGTAGTATG TTTGTTGTTG TTGTTGTTGT 2100 
TTGAGATGGA GCCCTGCTGT GTCACCCAGG CTGGAGTGCA GTGGCACGAA CTCGGCTCAC 2160 
TGCAACCTCT GCCTCATGGG TTCAATCAGT TATTCTGCCT CAGTGTCCCA AGTAGCTGGG 2220 
ACTACAAGGC ACATGCCACC ATGCCTGGCT AATTTTTGTA TTTTTAGTAG AAACAGGGCT 2280 
TCACCATGTT GGCCAGGCTG GTCTCGAACT CCTGACCTCA AGTGATCCAG CCGCCTCGGC 2340 
CTCCCAAAGT GCTGGGATTA CAGGCATAAG CCACCGTGCC CAGCCTAATA GTATGTTTTT 2400 
AAACTCTTAG TGGCTTAACA ATGCTGGTTG TATAATAAAT ATGCCATAAA TATTTACTGT 2460 
CTTAGAATTA TGAAGAAGTG GTTACTAGGC CGTTTGCCAC ATATCAATGG TTCTCTCCTT 2520 
ACAGCTTTAA TTAGAGTCTA GAATTGCAGG TTGGTAGAGC TGGAACAGAC CTTAAAGATT 2580 
GACTAGCCAA CTTCCTTGTC CAAATGAGGG AACTGAGACC CTTAAAATTA AGTGACTTGC 2640 
CCCAGACAAA ACTGGAACTC ATGTGTCCTA ATTTCCATCA TGAAATTCTA CCATTCACTA 2700 
GCCTCTGGCT AGTTGTCAAA GTATTGCATA ACTAAATTTT TATGTCTGTT TTAAAGAACA 2760 
AATTGTCACT GCTTACTCCT GGGAGGGTCT TTCTGAGGTG GTTTATAACT CTTAAAAAAA 2820 
AAAAAGTCAG TAGTCTGAGA ATTTTAGACG AAATAGTCAA AGCATTTTTA TCCAATGGAT 2880 
CTATAATTTT CATAGATTAG AGTTAAATCA AAGAAACACG GATGAGAAAG GAAGAGGAAA 2940 
ATTGAGGAGA GGAGGAATGG GGATGAGAAC ACACTACTTG TAATCAGTCA TAGATGTACT 3000 
GAGAACTAAC AAGAAGAATT GTAAGAAAAT AAGAATGAAG AATTCAAAAT CAACACATGA 3060 
AATAAAAAGA AACTACTAGG GAAAAATGGA GAAGACATTA GAAAAATTAT TCTATTTTTA 3120 
AAATTCTGTT TTCAGGCTTC CCTCCTGTTC TTCCTCCTTC TCATTGGTTT TCAGGTGGAG 3180 
GGAAAGTTTA AGATGGAAAA AATATATATA TTCTACACAT CCCTTTCTAC GCTGTTGTCA 3240 
TGGCAACAAG GTTTATCATA GCAAACTTTT ATTCATACAA CATTTATTGA GTTCTTACTG 3300 
TGTGGTAAGC TCTTTCCAGG TGTTGAAAAT TCAGGGGAAA AAAGACAACT CATTGTCTTA 3360 
AAACTCAGAT GAAAGCTGAA CAGACCTATT TTTAATCAAA GTAATCTCAA TTTAGGGTAG 3420 
TAAGAGCTAT TTAAGAAGCA TGAACAGGTG TGAAGGAGGT AGGACTCTGA GGAGAGAATA 3480 
GTTAGCTAGG AATGAAAGAG CAGAGAAGTT TTCCTAGAGG AACTATTAAA GCTGGGAGTT 3540 
ACGGGATGAA AGATGAGGCA GGGTTTGCAG GCAAAAAAAA AAAAAAGGCA GGGGAAGGGG 3600 
AAGTTCTGGC CTGGCAGAGA GAATAACTGT GGCAACAATG GAGGAGAGTC TGGAAGCAAG 3660 
AAAACCAAGT AGAAGAGTAT TAAAATAGAA GATGCCAGGG GTAATGAGGG CTTGATTTAA 3720 
AACAGTGCTG TTGGAGATGG AGAGGAGATA CCAAATTCTG GAGACATTTC TGAGTTAGAA 3780 
CCTACAGTAT TTATCAGACA AGGGAAAGAT TAGACAAAGG AGTTAAGAAT GACTCCCAGG 3840 
TTTCAGTTTG GGGCAGGTAA CTAGGACATG TTTTGAAAAG TAATGTATTG GATCTCTTAC 3900 
CATTGGAACT ATGTATGTGG AGCCAAATTA AAATTTGTAC ATGTATATAA CTCTCCCCCC 3960 
ACCACCAGTA ACTACTTCCC TAACTCTCTA CTTTGTAGCC AGACTTCCTA AAAGAATAGT 4020 
TTGTAGTCAC TGTCTTTACT TTTCCCCTCC CATTCTGTCC TAGATATTTG TCCACCTACC 4080 
ATCTGCTGCC TCCACTTTAC CCAAACTGTT CTACGGTTGC CCAAAACTTC CTAATTGCCA 4140 
AATTCAATGA ACAAGTTTAA GCTTATATGT AAATTAGGAG CTCTACAGTT TGATTTCGAG 4200 
CAGCCCCTCC TGAAACCCTT TCTCTTTCGA CTTCTGTGAC ACATCTCAGA TTTACAAAAC 4260 
TGAACTAATT ATTTTACACT TGAGCTGTAT TTTCGTTCTT CTTTCTTGAT GAATGAGGTA 4320 
ACCACTCAAC AAATTGCCCA AGCCAAAAAC TACGAAGTCA TCCTCAGTTC CTCCTTCTTC 4380 
TGTTTGACCC ACAACAGATC AGCTGAGAAA TCCCGCTGTT TAGTATCTCT TGAATTCATT 4440 
ACCTTAATTT ATAGCCTCAT CAACTCTTAA TTGTTAAAAT TACTTCAGTA GTTGTTGTCT 4500 
GACCTCTGTC CAATCTTGTT CAATCAGGTC CATTCTTTTG TTCTTGGTGG TGGTGGTGGT 4560 
GTTGACAGAG TTTCGCTTTT GCTGCCCAGG CTGAAGTGCA GTGGAGCACT TCACTGCAAC 4620 
CACAGCCTCC TGGGTTTAAG CAGTTCACCC TCCCGAGTAG CTGGGACTAC AGGTATGTGC 4680 
CACCACACCC AGCTAATTTT GTGTTTTCAG TAGAGACAGG GTTTCACCAT GTTGGTCAGG 4740 
CTGGTCTCAA ACTCCTGACC TCAAGCAATC CACCCACCTC AGCCTCCCAA AGTGCTGGGA 4800 
TTACAGGCAT GAGCCACTGC ACACGGACCA GATCCATTGT TTATGTTGCT TCTAGAGTGA 4860 
GTTTTTAAAA CACAAATTTG ACCATATCTT TCTCCAATTT AAGTCAGTAT TTTTTTTTTC 4920 
AGGAAAAAAC AGTTCAAACT CTTTAGTCTG CTTACACAAG GCCTTTGTAG TCTGACTCTT 4980 
CTTTCCAAGC TTTCATCAAA GTATACTGCA AGTTACATTT TATGTGAATT GAATTAGGCA 5040 
ACGGTATAAA AATTATAGTT TATATGGGCA AAATGGAAAT AATGTTAACT CTTCCAAATA 5100 
GTTTATCTAG AATGACATAA TTTCAAAGCT GTCAGGTCAA ATGAGTTATA AACTGTTAAC 5160 
ACTATTGCCA CATGCAAGTG TCTCTTATAC TTGGTAGAAT TATCTGCTTC CATGTCATTA 5220 
TTATGTAAAT TAGACTTTAA ATAACTCAGA AGTTCTTCAG ACATACAGGT TATTATTGTG 5280 
CTTTTTAAAC ATAATTTTAA ATAATTTTAT ATATGATAAT GTTATCCAAG TGCTAAGGGA 5340 
TGTATTGTTA CTGCTGTGCA AAAAAAAAAA AAAAAAAAAC TCCAAATAAA TATGTTGAAA 5400 
CCAAGTTTAT ATGCAAGAAA ACAATATTAA AAAGGCCAAA GTACCACCAT AATAGGCTGT 5460 
GTGGAGACGG CAGGCTACAA AACACTAGTA ATAATGCTGA GAAAGTTGAA AAAAGAAAGA 5520 
AAGCAACAAT ATGCTTTGGT TGTTGTAGGT TTATGTACTC CAAGAATATC TCCTCTCAAA 5580 
CTTTTACGTT TTTTCCAAAG AAAAGTTAAC TTTGGCTGGG CGCAGTGGCT CTTGCCTGTA 5640 
GTCCCAGCCT TTGGGAGGCC AAGGCGGGCA GATCACCTGA GGTCAGGAGT TTGAGACCAG 5700 
CCTGACCAAA AATGGAGAAA CCCGCCCCCC TCACTACTAA AAGAATACAA AATTAGGCCG 5760 
GGCACAGTGG CTTACCCCTG TGATCCCAGC ACTTTGGGAG GCCGAAGCAG GAAGATCACC 5820 
TGAGGTCAGG AGTTCGAGAC CAGCCATGGA GAAACCCGTC TCTACTAAAA ATACAAAATT 5880 
AGCCGGGCGT GGTGGTGCAT GACTGTAATC CCAGCTACTC AGGAGGCTAA GGCAGAGAAT 5940 
CACTTGAACC CAGGCAGTGG AGGTTGCAGT GAGCCGAGAT CGTGCCATTG CACTCCAGGC 6000 
TGGGCAACAA GAGCGAAACT CTGTATCCAA AAAACAAAAG AAAAGAAAAG GTAACCTTGA 6060 
ACTATGTGAG ATCTTTAGAA ATGCATTCTT TCTGTAAAAT GTGACTACAT TTGCCTTATT 6120 
TATGGTAAAA ATGTTGAGGC CTCAAACAAC CCATATTTTC TCGGTCTCCC CGCTGCCTAG 6180 
CCTTTGTTCA CATTGCTTCT TCTTGGTGGA AGCTCTTCCT CTGGCCTTGA AAATGCCTGC 6240 
TTCTCTTTCA AGGTAGCACA GTCATCACTT TCTGTGGTAA CCTTCTCCAG CACCATCAAA 6300 
CAGAAAGAAT GAATCTCTTG TAAATTCAGC TCTTACGTCA TTCATTACAT TATTTTGTAA 6360 
CTCTTTATAG ATTCTTCTCT CCCACTAGAC TCTGAGTCAC TGGAGAGTAG GAGCCAACTC 6420 
TCATTCATGT GTGGTTTGGT CAGCTACTGG CCACATTCCT GATGCATAGT TAATGCTCAA 6480 
ACCTTAACTG GTGAATCAGC TCAAATATTG TCCTTCTCTA AATCCATTCA CTCATTGACT 6540 
AACTATGTAC TCAAAATAGT AAACACCAGT AATTTAATCC AATTCCTGCC CATACTGCTT 6600 
GGTACATTTC AGGTGAATTA GTTTGATAAA TATGTGTGTA TTACATAATA TTAAAGTATG 6660 
TACAGAAGAT CATGCTAATC ATAATTCACA ACTGATAACT AATCAAACAT AAATGCTCTC 6720 
AGGTTAACAA ATGTCTGCCT TCTCAGTTAA TGCAGTCATT AACAAACACC TTCTGATGCT 6780 
GATAATAGGG CCTTGTTCAG CAATGAAGCC ATAAAGGTGA ATAAAGAACA TGCCCTCGTG 6840 
GAGCTCACAG CCTAGTCATT ATTGTTCTGA TTTTTAATAT TAATGTTGGT TTGGGTTTTG 6900 
GTGAAAAATG TTTAGACTTA TCTTAGTGAT CTTTTCATCC TTTGCTATAT TATTTTTCTC 6960 
TAAGAGTCTT CCTTATCCCC TCCTTTAAAA AACTAGGTGA TAATTCTAAA TTGTAAATTT 7020 
AAATATTATA AATAGCTTAT AAAATTTAAT ATTTATAATA TTTAAATGTT TGATAAATAT 7080 
TTAAATTTTA TAATATTTAA ATGTTTATTT AAATTCATTT GTACATCAGT TTTTATTTTA 7140 
TTTAAATGTG TTGGCCAGGC ATGGTGGCTG ACACCTATAA TCCCAGAACT TTGAGAGGCC 7200 
AAGTCAGGCA AACCATTTGA GCTCAGGAGT TTGAGACCAC CCTGGGCAAC GTGGTGAAAC 7260 
CCTGTCTCTA CCAAACATAT GAAAACTTAT CTGGGTGTGG TGGCACGCAT CTGTGGTCCC 7320 
AGATGGGAGT CCCAGGCTAA GATGGGAGAA TCGCTTGAAC CCAGGTGAGA GGGGTGGGGT 7380 
GGATGTTGCA GTGAGCTGAG ATCGTGCCAC TGCACTCCAA CCTGGGTGAC AGAGTGAGAC 7440 
TCCATCTCAA AAAAAAAAAA TGTTATCTAA ATAAGATAAA TTTAATAACT GTTCGCACTT 7500 
AGATGAGCAT AAGGAACTAA ACCTAGATAA AACTATCAAA TAAGGCCTGG GTACAGTGAC 7560 
TCATGCCTGT AATCTCAAGC ACTTTGGGAG GCCAAAATTA TACAAAGTTA GTTGTATAAC 7620 
ACCAACTAAC AACTATTTTG GGGTTAGCTT AATTCAGATT AATTTTTTTT AAACTGAGTT 7680 
TTAAATTCCT GCTTACTCTA CCATACATGC TAGGCCTCAT ATTATGCTAG AAAAATTTTG 7740 
AGCACAGATT TATGAATACT CTCCTGCATA CCATTTAATT TTTAAACAAA TTTTAATGCA 7800 
GTATATATGT GCCTTTTTAC CAACACATTA AATAATAAGA TCTACTGTGA GGACTAAATT 7860 
TCTGTAATTT CAAAGTAGTA ATGAGTTTAA ACCATGTCTC AAGATCTCTG CAATAACTGT 7920 
AGCACAACAG AAAATAGGTA TTTCTATTAA TGACAGAGTC ACAAGTACTA CTAATAATAC 7980 
TGTGGTTTGT TTCCTGCAAC TAATCATGGG AGGAATGCTA AATTTCAGAG GTTGGTGAAA 8040 
ATACATGTGT ATTTTTTTCC CCATCCAAGT TCACAGATTT CTCACACTGA GAACTCCTAT 8100 
TCCATAACAA AATTCTGGAA GCCTGCACAC CGTATTGGAA GAAGGGCAGA AAGGAAAAGC 8160 
AAATGGAAGG ATTTAAATTT TTTTCAAATC CTGTATCCCT TGATTTTACA GCAAGATTGT 8220 
ATTTATGTAT TACTTGTGTT AAAAATATAG TATAATCGAG ACTCCAGATC AAAAATCACC 8280 
GCAGCTCAGG GAGAAAGAGG GCCACCAAAT GCCAGAGCCC TTCAGCCTTC TCCCACCCTG 8340 
CCTGTACCCT CAGATGGAAG CACTTTTTTA TCATTGTTTC ACCTTTAGCA TTTTGACAAT 8400 
GAAGTCACAA ACCTTCAGCC TCTCACCCAT AGGAACCCAC TGGTTGTAAG AGAAGGATGA 8460 
AGCCAGTCCT TCCTAAAGGG CACGATTAGA TGTGTTTATG GCATCCTCAG GTGAAACTAT 8520 
ATTTATATTG ACAATATATT TATATTTCTC AAGGAATACT AGAATAATGA TTCAGTTCAG 8580 
TACTAGGCCA TTTATCTACC CTTTATAATA TTGTTTAATG AGAAAATGCT TTCTATCTTC 8640 
CAAATATCTG ATGATTTGTA AGAGAACACT TAAACATGGG TATTCATAAG CTGAAACTTC 8700 
TGGCATTTAT TGAATGTCAA GATTGTTCAT CAGTATACTA GGTGATTAAC TGACCACTGA 8760 
ACTTGAAGGT AGTATAAAGT AGTAGTAAAA GGTACAATCA TTGTCTCTTA ACAGATGGCT 8820 
CTTTGCTTTC ATTAG 8835 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1371 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..1371

(C) IDENTIFICATION METHODS: E

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTAAGGCTAA TGCCATAGAA CAAATACCAG GTTCAGATAA ATCTATTCAA TTAGAAAAGA 60 

TGTTGTGAGG TGAACTATTA AGTGACTCTT TGTGTCACCA AATTTCACTG TAATATTAAT 120 

GGCTCTTAAA AAAATAGTGG ACCTCTAGAA ATTAACCACA ACATGTCCAA GGTCTCAGCA 180 

CCTTGTCACA CCACGTGTCC TGGCACTTTA ATCAGCAGTA GCTCACTCTC CAGTTGGCAG 240 

TAAGTGCACA TCATGAAAAT CCCAGTTTTC ATGGGAAAAT CCCAGTTTTC ATTGGATTTC 300

CATGGGAAAA ATCCCAGTAC AAAACTGGGT GCATTCAGGA AATACAATTT CCCAAAGCAA 360

ATTGGCAAAT TATGTAAGAG ATTCTCTAAA TTTAGAGTTC CGTGAATTAC ACCATTTTAT 420

GTAAATATGT TTGACAAGTA AAAATTGATT CTTTTTTTTT TTTTCTGTTG CCCAGGCTGG 480

AGTGCAGTGG CACAATCTCT GCTCACTGCA ACCTCCACCT CCTGGGTTCA AGCAATTCTC 540

CTGCCTCAGC CTTCTGAGTA GCTGGGACTA CAGGTGCATC CCGCCATGCC TGGCTAATTT 600

TTGGGTATTT TTACTAGAGA CAGGGTTTTG GCATGTTGTC CAGGCTGGTC TTGGACTCCT 660

GATCTCAGAT GATCCTCCTG GCTCGGGCTC CCAAAGTGCT GGGATTACAG GCATGAACCA 720

CCACACATGG CCTAAAAATT GATTCTTATG ATTAATCTCC TGTGAACAAT TTGGCTTCAT 780

TTGAAAGTTT GCCTTCATTT GAAACCTTCA TTTAAAAGCC TGAGCAACAA AGTGAGACCC 840

CATCTCTACA AAAAACTGCA AAATATCCTG TGGACACCTC CTACCTTCTG TGGAGGCTGA 900 

AGCAGGAGGA TCACTTGAGC CTAGGAATTT GAGCCTGCAG TGAGCTATGA TCCCACCCCT 960 

ACACTCCAGC CTGCATGACA GTAGACCCTG ACACACACAC ACAAAAAAAA ACCTTCATAA 1020

AAAATTATTA GTTGACTTTT CTTAGGTGAC TTTCCGTTTA AGCAATAAAT TTAAAAGTAA 1080 

AATCTCTAAT TTTAGAAAAT TTATTTTTAG TTACATATTG AAATTTTTAA ACCCTAGGTT 1140 

TAAGTTTTAT GTCTAAATTA CCTGAGAACA CACTAAGTCT GATAAGCTTC ATTTTATGGG 1200 

CCTTTTGGAT GATTATATAA TATTCTGATG AAAGCCAAGA CAGACCCTTA AACCATAAAA 1260 

ATAGGAGTTC GAGAAAGAGG AGTAGCAAAA GTAAAAGCTA GAATGAGATT GAATTCTGAG 1320 

TCGAAATACA AAATTTTACA TATTCTGTTT CTCTCTTTTT CCCCCTCTTA G 1371

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3383 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1..3383 

(C) IDENTIFICATION METHODS: E 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GTAAAGTAGA AATGAATTTA TTTTTCTTTG CAAACTAAGT ATCTGCTTGA GACACATCTA 60

TCTCACCATT GTCAGCTGAG GAAAAAAAAA AATGGTTCTC ATGCTACCAA TCTGCCTTCA 120

AAGAAATGTG GACTCAGTAG CACAGCTTTG GAATGAAGAT GATCATAAGA GATACAAAGA 180

AGAACCTCTA GCAAAAGATG CTTCTCTATG CCTTAAAAAA TTCTCCAGCT CTTAGAATCT 240

ACAAAATAGA CTTTGCCTGT TTCATTGGTC CTAAGATTAG CATGAAGCCA TGGATTCTGT 300

TGTAGGGGGA GCGTTGCATA GGAAAAAGGG ATTGAAGCAT TAGAATTGTC CAAAATCAGT 360

AACACCTCCT CTCAGAAATG CTTTGGGAAG AAGCCTGGAA GGTTCCGGGT TGGTGGTGGG 420

GTGGGGCAGA AAATTCTGGA AGTAGAGGAG ATAGGAATGG GTGGGGCAAG AAGACCACAT 480

TCAGAGGCCA AAAGCTGAAA GAAACCATGG CATTTATGAT GAATTCAGGG TAATTCAGAA 540

TGGAAGTAGA GTAGGAGTAG GAGACTGGTG AGAGGAGCTA GAGTGATAAA CAGGGTGTAG 600

AGCAAGACGT TCTCTCACCC CAAGATGTGA AATTTGGACT TTATCTTGGA GATAATAGGG 660

TTAATTAAGC ACAATATGTA TTAGCTAGGG TAAAGATTAG TTTGTTGTAA CAAAGACATC 720
CAAAGATACA GTAGCTGAAT AAGATAGAGA ATTTTTCTCT CAAAGAAAGT CTAAGTAGGC 780
AGCTCAGAAG TAGTATGGCT GGAAGCAACC TGATGATATT GGGACCCCCA ACCTTCTTCA 840
GTCTTGTACC CATCATCCCC TAGTTGTTGA TCTCACTCAC ATAGTTGAAA ATCATCATAC 900
TTCCTGGGTT CATATCCCAG TTATCAAGAA AGGGTCAAGA GAAGTCAGGC TCATTCCTTT 960
CAAAGACTCT AATTGGAAGT TAAACACATC AATCCCCCTC ATATTCCATT GACTAGAATT 1020
TAATCACATG GCCACACCAA GTGCAAGGAA ATCTGGAAAA TATAATCTTT ATTCCAGGTA 1080
GCCATATGAC TCTTTAAAAT TCAGAAATAA TATATTTTTA AAATATCATT CTGGCTTTGG 1140
TATAAAGAAT TGATGGTGTG GGGTGAGGAG GCCAAAATTA AGGGTTGAGA GCCTATTATT 1200
TTAGTTATTA CAAGAAATGA TGGTGTCATG AATTAAGGTA GACATAGGGG AGTGCTGATG 1260
AGGAGCTGTG AATGGATTTT AGAAACACTT GAGAGAATCA ATAGGACATG ATTTAGGGTT 1320 
GGATTTGGAA AGGAGAAGAA AGTAGAAAAG ATGATGCCTA CATTTTTCAC TTAGGCAATT 1380 
TGTACCATTC AGTGAAATAG GGAACACAGG AGGAAGAGCA GGTTTTGGTG TATACAAAGA 1440
GGAGGATGGA TGACGCATTT CGTTTTGGAT CTGAGATGTC TGTGGAACGT CCTAGTGGAG 1500 
ATGTCCACAA ACTCTTCTAC ATGTGGTTCT GAGTTCAGGA CACAGATTTG GGCTGGAGAT 1560 
AGAGATATTG TAGGCTTATA CATAGAAATG GCATTTGAAT CTATAGAGAT AAAAAGACAC 1620
ATCAGAGGAA ATGTGTAAAG TGAGAGAGGA AAAGCCAAGT ACTGTGCTGG GGGGAATACC 1680 
TACATTTAAA GGATGCAGTA GAAAGAAGCT AATAAACAAC AGAGAGCAGA CTAACCAAAA 1740
GGGGAGAAGA AAAACCAAGA GAATTCCACC GACTCCCAGG AGAGCATTTC AAGATTGAGG 1800
GGATAGGTGT TGTGTTGAAT TTTGCAGCCT TGAGAATCAA GGGCCAGAAC ACAGCTTTTA 1860 
GATTTAGCAA CAAGGAGTTT GGTGATCTCA GTGAAAGCAG CTTGATGGTG AAATGGAGGC 1920 
AGAGGCAGAT TGCAATGAGT GAAACAGTGA ATGGGAAGTG AAGAAATGAT ACAGATAATT 1980
CTTGCTAAAA GCTTGGCTGT TAAAAGGAGG AGAGAAACAA GACTAGCTGC AAAGTGAGAT 2040
TGGGTTGATG GAGCAGTTTT AAATCTCAAA ATAAAGAGCT TTGTGCTTTT TTGATTATGA 2100 
AAATAATGTG TTAATTGTAA CTAATTGAGG CAATGAAAAA AGATAATAAT ATGAAAGATA 2160 
AAAATATAAA AACCACCCAG AAATAATGAT AGCTACCATT TTGATACAAT ATTTCTACAC 2220
TCCTTTCTAT GTATATATAC AGACACAGAA ATGCTTATAT TTTTATTAAA AGGGATTGTA 2280
CTATACCTAA GCTGCTTTTT CTAGTTAGTG ATATATATGG ACATCTCTCC ATGGCAACGA 2340 
GTAATTGCAG TTATATTAAG TTCATGATAT TTCACAATAA GGGCATATCT TTGCCCTTTT 2400 
TATTTAATCA ATTCTTAATT GGTGAATGTT TGTTTCCAGT TTGTTGTTGT TATTAACAAT 2460 
GTTCCCATAA GCATTCCTGT ACACCAATGT TCACACATTT GTCTGATTTT TTCTTCAGGA 2520 
TAAAACCCAG GAGGTAGAAT TGCTGGGTTG ATAGAAGAGA AAGGATGATT GCCAAATTAA 2580
AGCTTCAGTA GAGGGTACAT GCCGAGCACA AATGGGATCA GCCCTAGATA CCAGAAATGG 2640
CACTTTCTCA TTTCCCCTTG GGACAAAAGG GAGAGAGGCA ATAACTGTGC TGCCAGAGTT 2700
AAATTTGTAC GTGGAGTAGC AGGAAATCAT TTGCTGAAAA TGAAAACAGA GATGATGTTG 2760
TAGAGGTCCT GAAGAGAGCA AAGAAAATTT GAAATTGCGG CTATCAGCTA TGGAAGAGAG 2820
TGCTGAACTG GAAAACAAAA GAAGTATTGA CAATTGGTAT GCTTGTAATG GCACCGATTT 2880
GAACGCTTGT GCCATTGTTC ACCAGCAGCA CTCAGCAGCC AAGTTTGGAG TTTTGTAGCA 2940
GAAAGACAAA TAAGTTAGGG ATTTAATATC CTGGCCAAAT GGTAGACAAA ATGAACTCTG 3000
AGATCCAGCT GCACAGGGAA GGAAGGGAAG ACGGGAAGAG GTTAGATAGG AAATACAAGA 3060
GTCAGGAGAC TGGAAGATGT TGTGATATTT AAGAACACAT AGAGTTGGAG TAAAAGTGTA 3120
AGAAAACTAG AAGGGTAAGA GACCGGTCAG AAAGTAGGCT ATTTGAAGTT AACACTTCAG 3180
AGGCAGAGTA GTTCTGAATG GTAACAAGAA ATTGAGTGTG CCTTTGAGAG TAGGTTAAAA 3240
AACAATAGGC AACTTTATTG TAGCTACTTC TGGAACAGAA GATTGTCATT AATAGTTTTA 3300
GAAAACTAAA ATATATAGCA TACTTATTTG TCAATTAACA AAGAAACTAT GTATTTTTAA 3360
ATGAGATTTA ATGTTTATTG TAG 3383

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11464 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: human

(F) TISSUE TYPE: placenta

(ix) FEATURE:

(A) NAME/KEY: 5'UTR

(B) LOCATION: 1..3

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4..82

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 83..1453

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 1454..1465

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 1466..4848

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 4849..4865

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 4866..4983

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 4984..6317

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 6318..6451

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 6452..11224

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 11225..11443

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3'UTR

(B) LOCATION: 11444..11464

(C) IDENTIFICATION METHODS: E
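
The feature locations listed for SEQ ID NO: 13 can be cross-checked mechanically. The following is a minimal sketch, not part of the original listing, assuming the coordinates are 1-based and inclusive; the names `SEQ13_FEATURES`, `tiles_sequence` and `total_length_of` are hypothetical and introduced only for illustration. It checks that the features cover positions 1 to 11464 without gaps or overlaps and sums the lengths of the leader peptide and mature peptide segments.

```python
# Feature table of SEQ ID NO: 13 as given in the listing.
# Assumption: coordinates are 1-based and inclusive.
SEQ13_LENGTH = 11464
SEQ13_FEATURES = [
    ("5'UTR", 1, 3),
    ("leader peptide", 4, 82),
    ("intron", 83, 1453),
    ("leader peptide", 1454, 1465),
    ("intron", 1466, 4848),
    ("leader peptide", 4849, 4865),
    ("mat peptide", 4866, 4983),
    ("intron", 4984, 6317),
    ("mat peptide", 6318, 6451),
    ("intron", 6452, 11224),
    ("mat peptide", 11225, 11443),
    ("3'UTR", 11444, 11464),
]


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def tiles_sequence(features, total_length):
    """True if the features cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for _name, start, end in features:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == total_length


def total_length_of(features, key):
    """Summed length of all features whose name equals key."""
    return sum(span_length(s, e) for name, s, e in features if name == key)


def intron_lengths(features):
    """Lengths of the introns in listing order."""
    return [span_length(s, e) for name, s, e in features if name == "intron"]


leader_nt = total_length_of(SEQ13_FEATURES, "leader peptide")
mature_nt = total_length_of(SEQ13_FEATURES, "mat peptide")

print("features tile the sequence:", tiles_sequence(SEQ13_FEATURES, SEQ13_LENGTH))
print("leader peptide:", leader_nt, "nt =", leader_nt // 3, "codons")
print("mature peptide:", mature_nt, "nt =", mature_nt // 3, "codons")
print("intron lengths:", intron_lengths(SEQ13_FEATURES))
```

Under these assumptions the leader segments sum to 108 nucleotides (36 codons, residues -36 to -1) and the mature peptide segments to 471 nucleotides (157 codons), and the first two intron lengths, 1371 and 3383, equal the lengths stated for SEQ ID NO: 11 and SEQ ID NO: 12.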

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

AAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA 48
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala
-35 -30 -25

ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G GTAAGG CTAATGCCAT 98
Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
-20 -15 -10 

AGAACAAATA CCAGGTTCAG ATAAATCTAT TCAATTAGAA AAGATGTTGT GAGGTGAACT 158 

ATTAAGTGAC TCTTTGTGTC ACCAAATTTC ACTGTAATAT TAATGGCTCT TAAAAAAATA 218 

GTGGACCTCT AGAAATTAAC CACAACATGT CCAAGGTCTC AGCACCTTGT CACACCACGT 278

GTCCTGGCAC TTTAATCAGC AGTAGCTCAC TCTCCAGTTG GCAGTAAGTG CACATCATGA 338

AAATCCCAGT TTTCATGGGA AAATCCCAGT TTTCATTGGA TTTCCATGGG AAAAATCCCA 398

GTACAAAACT GGGTGCATTC AGGAAATACA ATTTCCCAAA GCAAATTGGC AAATTATGTA 458

AGAGATTCTC TAAATTTAGA GTTCCGTGAA TTACACCATT TTATGTAAAT ATGTTTGACA 518 

AGTAAAAATT GATTCTTTTT TTTTTTTTCT GTTGCCCAGG CTGGAGTGCA GTGGCACAAT 578 

CTCTGCTCAC TGCAACCTCC ACCTCCTGGG TTCAAGCAAT TCTCCTGCCT CAGCCTTCTG 638 

AGTAGCTGGG ACTACAGGTG CATCCCGCCA TGCCTGGCTA ATTTTTGGGT ATTTTTACTA 698

GAGACAGGGT TTTGGCATGT TGTCCAGGCT GGTCTTGGAC TCCTGATCTC AGATGATCCT 758 

CCTGGCTCGG GCTCCCAAAG TGCTGGGATT ACAGGCATGA ACCACCACAC ATGGCCTAAA 818 

AATTGATTCT TATGATTAAT CTCCTGTGAA CAATTTGGCT TCATTTGAAA GTTTGCCTTC 878

ATTTGAAACC TTCATTTAAA AGCCTGAGCA ACAAAGTGAG ACCCCATCTC TACAAAAAAC 938

TGCAAAATAT CCTGTGGACA CCTCCTACCT TCTGTGGAGG CTGAAGCAGG AGGATCACTT 998

GAGCCTAGGA ATTTGAGCCT GCAGTGAGCT ATGATCCCAC CCCTACACTC CAGCCTGCAT 1058

GACAGTAGAC CCTGACACAC ACACACAAAA AAAAACCTTC ATAAAAAATT ATTAGTTGAC 1118 

TTTTCTTAGG TGACTTTCCG TTTAAGCAAT AAATTTAAAA GTAAAATCTC TAATTTTAGA 1178 

AAATTTATTT TTAGTTACAT ATTGAAATTT TTAAACCCTA GGTTTAAGTT TTATGTCTAA 1238 

ATTACCTGAG AACACACTAA GTCTGATAAG CTTCATTTTA TGGGCCTTTT GGATGATTAT 1298

ATAATATTCT GATGAAAGCC AAGACAGACC CTTAAACCAT AAAAATAGGA GTTCGAGAAA 1358

GAGGAGTAGC AAAAGTAAAA GCTAGAATGA GATTGAATTC TGAGTCGAAA TACAAAATTT 1418

TACATATTCT GTTTCTCTCT TTTTCCCCCT CTTAG CT GAA GAT GAT G GTAAA 1470

Ala Glu Asp Asp Glu 
-10 

GTAGAAATGA ATTTATTTTT CTTTGCAAAC TAAGTATCTG CTTGAGACAC ATCTATCTCA 1530 

CCATTGTCAG CTGAGGAAAA AAAAAAATGG TTCTCATGCT ACCAATCTGC CTTCAAAGAA 1590 

ATGTGGACTC AGTAGCACAG CTTTGGAATG AAGATGATCA TAAGAGATAC AAAGAAGAAC 1650 
CTCTAGCAAA AGATGCTTCT CTATGCCTTA AAAAATTCTC CAGCTCTTAG AATCTACAAA 1710

ATAGACTTTG CCTGTTTCAT TGGTCCTAAG ATTAGCATGA AGCCATGGAT TCTGTTGTAG 1770

GGGGAGCGTT GCATAGGAAA AAGGGATTGA AGCATTAGAA TTGTCCAAAA TCAGTAACAC 1830

CTCCTCTCAG AAATGCTTTG GGAAGAAGCC TGGAAGGTTC CGGGTTGGTG GTGGGGTGGG 1890

GCAGAAAATT CTGGAAGTAG AGGAGATAGG AATGGGTGGG GCAAGAAGAC CACATTCAGA 1950

GGCCAAAAGC TGAAAGAAAC CATGGCATTT ATGATGAATT CAGGGTAATT CAGAATGGAA 2010

GTAGAGTAGG AGTAGGAGAC TGGTGAGAGG AGCTAGAGTG ATAAACAGGG TGTAGAGCAA 2070 

GACGTTCTCT CACCCCAAGA TGTGAAATTT GGACTTTATC TTGGAGATAA TAGGGTTAAT 2130 

TAAGCACAAT ATGTATTAGC TAGGGTAAAG ATTAGTTTGT TGTAACAAAG ACATCCAAAG 2190 

ATACAGTAGC TGAATAAGAT AGAGAATTTT TCTCTCAAAG AAAGTCTAAG TAGGCAGCTC 2250

AGAAGTAGTA TGGCTGGAAG CAACCTGATG ATATTGGGAC CCCCAACCTT CTTCAGTCTT 2310

GTACCCATCA TCCCCTAGTT GTTGATCTCA CTCACATAGT TGAAAATCAT CATACTTCCT 2370

GGGTTCATAT CCCAGTTATC AAGAAAGGGT CAAGAGAAGT CAGGCTCATT CCTTTCAAAG 2430

ACTCTAATTG GAAGTTAAAC ACATCAATCC CCCTCATATT CCATTGACTA GAATTTAATC 2490 

ACATGGCCAC ACCAAGTGCA AGGAAATCTG GAAAATATAA TCTTTATTCC AGGTAGCCAT 2550 

ATGACTCTTT AAAATTCAGA AATAATATAT TTTTAAAATA TCATTCTGGC TTTGGTATAA 2610 

AGAATTGATG GTGTGGGGTG AGGAGGCCAA AATTAAGGGT TGAGAGCCTA TTATTTTAGT 2670

TATTACAAGA AATGATGGTG TCATGAATTA AGGTAGACAT AGGGGAGTGC TGATGAGGAG 2730

CTGTGAATGG ATTTTAGAAA CACTTGAGAG AATCAATAGG ACATGATTTA GGGTTGGATT 2790

TGGAAAGGAG AAGAAAGTAG AAAAGATGAT GCCTACATTT TTCACTTAGG CAATTTGTAC 2850

CATTCAGTGA AATAGGGAAC ACAGGAGGAA GAGCAGGTTT TGGTGTATAC AAAGAGGAGG 2910 

ATGGATGACG CATTTCGTTT TGGATCTGAG ATGTCTGTGG AACGTCCTAG TGGAGATGTC 2970

CACAAACTCT TCTACATGTG GTTCTGAGTT CAGGACACAG ATTTGGGCTG GAGATAGAGA 3030

TATTGTAGGC TTATACATAG AAATGGCATT TGAATCTATA GAGATAAAAA GACACATCAG 3090 

AGGAAATGTG TAAAGTGAGA GAGGAAAAGC CAAGTACTGT GCTGGGGGGA ATACCTACAT 3150

TTAAAGGATG CAGTAGAAAG AAGCTAATAA ACAACAGAGA GCAGACTAAC CAAAAGGGGA 3210

GAAGAAAAAC CAAGAGAATT CCACCGACTC CCAGGAGAGC ATTTCAAGAT TGAGGGGATA 3270

GGTGTTGTGT TGAATTTTGC AGCCTTGAGA ATCAAGGGCC AGAACACAGC TTTTAGATTT 3330

AGCAACAAGG AGTTTGGTGA TCTCAGTGAA AGCAGCTTGA TGGTGAAATG GAGGCAGAGG 3390

CAGATTGCAA TGAGTGAAAC AGTGAATGGG AAGTGAAGAA ATGATACAGA TAATTCTTGC 3450

TAAAAGCTTG GCTGTTAAAA GGAGGAGAGA AACAAGACTA GCTGCAAAGT GAGATTGGGT 3510

TGATGGAGCA GTTTTAAATC TCAAAATAAA GAGCTTTGTG CTTTTTTGAT TATGAAAATA 3570

ATGTGTTAAT TGTAACTAAT TGAGGCAATG AAAAAAGATA ATAATATGAA AGATAAAAAT 3630

ATAAAAACCA CCCAGAAATA ATGATAGCTA CCATTTTGAT ACAATATTTC TACACTCCTT 3690

TCTATGTATA TATACAGACA CAGAAATGCT TATATTTTTA TTAAAAGGGA TTGTACTATA 3750

CCTAAGCTGC TTTTTCTAGT TAGTGATATA TATGGACATC TCTCCATGGC AACGAGTAAT 3810

TGCAGTTATA TTAAGTTCAT GATATTTCAC AATAAGGGCA TATCTTTGCC CTTTTTATTT 3870

AATCAATTCT TAATTGGTGA ATGTTTGTTT CCAGTTTGTT GTTGTTATTA ACAATGTTCC 3930

CATAAGCATT CCTGTACACC AATGTTCACA CATTTGTCTG ATTTTTTCTT CAGGATAAAA 3990

CCCAGGAGGT AGAATTGCTG GGTTGATAGA AGAGAAAGGA TGATTGCCAA ATTAAAGCTT 4050

CAGTAGAGGG TACATGCCGA GCACAAATGG GATCAGCCCT AGATACCAGA AATGGCACTT 4110

TCTCATTTCC CCTTGGGACA AAAGGGAGAG AGGCAATAAC TGTGCTGCCA GAGTTAAATT 4170 

TGTACGTGGA GTAGCAGGAA ATCATTTGCT GAAAATGAAA ACAGAGATGA TGTTGTAGAG 4230

GTCCTGAAGA GAGCAAAGAA AATTTGAAAT TGCGGCTATC AGCTATGGAA GAGAGTGCTG 4290

AACTGGAAAA CAAAAGAAGT ATTGACAATT GGTATGCTTG TAATGGCACC GATTTGAACG 4350

CTTGTGCCAT TGTTCACCAG CAGCACTCAG CAGCCAAGTT TGGAGTTTTG TAGCAGAAAG 4410

ACAAATAAGT TAGGGATTTA ATATCCTGGC CAAATGGTAG ACAAAATGAA CTCTGAGATC 4470 

CAGCTGCACA GGGAAGGAAG GGAAGACGGG AAGAGGTTAG ATAGGAAATA CAAGAGTCAG 4530

GAGACTGGAA GATGTTGTGA TATTTAAGAA CACATAGAGT TGGAGTAAAA GTGTAAGAAA 4590

ACTAGAAGGG TAAGAGACCG GTCAGAAAGT AGGCTATTTG AAGTTAACAC TTCAGAGGCA 4650 

GAGTAGTTCT GAATGGTAAC AAGAAATTGA GTGTGCCTTT GAGAGTAGGT TAAAAAACAA 4710

TAGGCAACTT TATTGTAGCT ACTTCTGGAA CAGAAGATTG TCATTAATAG TTTTAGAAAA 4770

CTAAAATATA TAGCATACTT ATTTGTCAAT TAACAAAGAA ACTATGTATT TTTAAATGAG 4830

ATTTAATGTT TATTGTAG AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT 4880

Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu
-5 1 5

GAA TCT AAA TTA TCA GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC 4928
Glu Ser Lys Leu Ser Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe

10 15 20

ATT GAC CAA GGA AAT CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC 4976
Ile Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp

25 30 35

TGT AGA G GTATTTTTT TTAATTCGCA AACATAGAAA TGACTAGCTA CTTCTTCCCA 5032
Cys Arg Asp
40

TTCTGTTTTA CTGCTTACAT TGTTCCGTGC TAGTCCCAAT CCTCAGATGA AAAGTCACAG 5092

GAGTGACAAT AATTTCACTT ACAGGAAACT TTATAAGGCA TCCACGTTTT TTAGTTGGGG 5152 
TAAAAAATTG GATACAATAA GACATTGCTA GGGGTCATGC CTCTCTGAGC CTGCCTTTGA 5212

ATCACCAATC CCTTTATTGT GATTGCATTA ACTGTTTAAA ACCTCTATAG TTGGATGGTT 5272

AATCCCTGCT TGTTACAGCT GAAAATGCTG ATAGTTTACC AGGTGTGGTG GCATCTATCT 5332

GTAATCCTAG CTACTTGGGA GGCTCAAGCA GGAGGATTGC TTGAGGCCAG GACTTTGAGG 5392

CTGTAGTACA CTGTGATCGT ACCTGTGAAT AGCCACTGCA CTCCAGCCTG GGTGATATAC 5452

AGACCTTGTC TCTAAAATTA AAAAAAAAAA AAAAAAAAAC CTTAGGAAAG GAAATTGATC 5512

AAGTCTACTG TGCCTTCCAA AACATGAATT CCAAATATCA AAGTTAGGCT GAGTTGAAGC 5572 

AGTGAATGTG CATTCTTTAA AAATACTGAA TACTTACCTT AACATATATT TTAAATATTT 5632 

TATTTAGCAT TTAAAAGTTA AAAACAATCT TTTAGAATTC ATATCTTTAA AATACTCAAA 5692

AAAGTTGCAG CGTGTGTGTT GTAATACACA TTAAACTGTG GGGTTGTTTG TTTGTTTGAG 5752 

ATGCAGTTTC ACTCTGTCAC CCAGGCTGAA GTGCAGTGCA GTGCAGTGGT GTGATCTCGG 5812 

CTCACTACAA CCTCCACCTC CCACGTTCAA GCGATTCTCA TGCCTCAGTC TCCCGAGTAG 5872 

GTGGGATTAC AGGCATGCAC CACTTACACC CGGCTAATTT TTGTATTTTT AGTAGAGCTG 5932

GGGTTTCACC ATGTTGGCCA GGCTGGTCTC AAACCCCTAA CCTCAAGTGA TCTGCCTGCC 5992

TCAGCCTCCC AAACAAACAA ACAACCCCAC AGTTTAATAT GTGTTACAAC ACACATGCTG 6052

CAACTTTTAT GAGTATTTTA ATGATATAGA TTATAAAAGG TTGTTTTTAA CTTTTAAATG 6112 

CTGGGATTAC AGGCATGAGC CACTGTGCCA GGCCTGAACT GTGTTTTTAA AAATGTCTGA 6172 

CCAGCTGTAC ATAGTCTCCT GCAGACTGGC CAAGTCTCAA AGTGGGAACA GGTGTATTAA 6232 

GGACTATCCT TTGGTTAAAT TTCCGCAAAT GTTCCTGTGC AAGAATTCTT CTAACTAGAG 6292

TTCTCATTTA TTATATTTAT TTCAG AT AAT GCA CCC CGG ACC ATA TTT ATT 6343

Asp Asn Ala Pro Arg Thr Ile Phe Ile

40 45

ATA AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG GCT GTA ACT ATC 6391
Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile

50 55 60

TCT GTG AAG TGT GAG AAA ATT TCA ACT CTC TCC TGT GAG AAC AAA ATT 6439
Ser Val Lys Cys Glu Lys Ile Ser Thr Leu Ser Cys Glu Asn Lys Ile
65 70 75 80

ATT TCC TTT AAG GTAAG ACTGAGCCTT ACTTTGTTTT CAATCATGTT AATATAATCA 6496
Ile Ser Phe Lys

ATATAATTAG AAATATAACA TTATTTCTAA TGTTAATATA AGTAATGTAA TTAGAAAACT 6556 

CAAATATCCT CAGACCAACC TTTTGTCTAG AACAGAAATA ACAAGAAGCA GAGAACCATT 6616

AAAGTGAATA CTTACTAAAA ATTATCAAAC TCTTTACCTA TTGTGATAAT GATGGTTTTT 6676

CTGAGCCTGT CACAGGGGAA GAGGAGATAC AACACTTGTT TTATGACCTG CATCTCCTGA 6736

ACAATCAGTC TTTATACAAA TAATAATGTA GAATACATAT GTGAGTTATA CATTTAAGAA 6796

TAACATGTGA CTTTCCAGAA TGAGTTCTGC TATGAAGAAT GAAGCTAATT ATCCTTCTAT 6856

ATTTCTACAC CTTTGTAAAT TATGATAATA TTTTAATCCC TAGTTGTTTT GTTGCTGATC 6916

CTTAGCCTAA GTCTTAGACA CAAGCTTCAG CTTCCAGTTG ATGTATGTTA TTTTTAATGT 6976

TAATCTAATT GAATAAAAGT TATGAGATCA GCTGTAAAAG TAATGCTATA ATTATCTTCA 7036

AGCCAGGTAT AAAGTATTTC TGGCCTCTAC TTTTTCTCTA TTATTCTCCA TTATTATTCT 7096

CTATTATTTT TCTCTATTTC CTCCATTATT GTTAGATAAA CCACAATTAA CTATAGCTAC 7156 

AGACTGAGCC AGTAAGAGTA GCCAGGGATG CTTACAAATT GGCAATGCTT CAGAGGAGAA 7216

TTCCATGTCA TGAAGACTCT TTTTGAGTGG AGATTTGCCA ATAAATATCC GCTTTCATGC 7276

CCACCCAGTC CCCACTGAAA GACAGTTAGG ATATGACCTT AGTGAAGGTA CCAAGGGGCA 7336

ACTTGGTAGG GAGAAAAAAG CCACTCTAAA ATATAATCCA AGTAAGAACA GTGCATATGC 7396

AACAGATACA GCCCCCAGAC AAATCCCTCA GCTATCTCCC TCCAACCAGA GTGCCACCCC 7456

TTCAGGTGAC AATTTGGAGT CCCCATTCTA GACCTGACAG GCAGCTTAGT TATCAAAATA 7516 

GCATAAGAGG CCTGGGATGG AAGGGTAGGG TGGAAAGGGT TAAGCATGCT GTTACTGAAC 7576

AACATAATTA GAAGGGAAGG AGATGGCCAA GCTCAAGCTA TGTGGGATAG AGGAAAACTC 7636

AGCTGCAGAG GCAGATTCAG AAACTGGGAT AAGTCCGAAC CTACAGGTGG ATTCTTGTTG 7696

AGGGAGACTG GTGAAAATGT TAAGAAGATG GAAATAATGC TTGGCACTTA GTAGGAACTG 7756

GGCAAATCCA TATTTGGGGG AGCCTGAAGT TTATTCAATT TTGATGGCCC TTTTAAATAA 7816

AAAGAATGTG GCTGGGCGTG GTGGCTCACA CCTGTAATCC CAGCACTTTG GGAGGCCGAG 7876 

GGGGGCGGAT CACCTGAAGT CAGGAGTTCA AGACCAGCCT GACCAACATG GAGAAACCCC 7936

ATCTCTACTA AAAATACAAA ATTAGCTGGG CGTGGTGGCA TATGCCTGTA ATCCCAGCTA 7996

CTCGGGAGGC TGAGGCAGGA GAATCTTTTG AACCCGGGAG GCAGAGGTTG CGATGAGCCT 8056

AGATCGTGCC ATTGCACTCC AGCCTGGGCA ACAAGAGCAA AACTCGGTCT CAAAAAAAAA 8116 

AAAAAAAAAG TGAAATTAAC CAAAGGCATT AGCTTAATAA TTTAATACTG TTTTTAAGTA 8176

GGGCGGGGGG TGGCTGGAAG AGATCTGTGT AAATGAGGGA ATCTGACATT TAAGCTTCAT 8236

CAGCATCATA GCAAATCTGC TTCTGGAAGG AACTCAATAA ATATTAGTTG GAGGGGGGGA 8296

GAGAGTGAGG GGTGGACTAG GACCAGTTTT AGCCCTTGTC TTTAATCCCT TTTCCTGCCA 8356

CTAATAAGGA TCTTAGCAGT GGTTATAAAA GTGGCCTAGG TTCTAGATAA TAAGATACAA 8416

CAGGCCAGGC ACAGTGGCTC ATGCCTATAA TCCCAGCACT TTGGGAGGGC AAGGCGAGTG 8476

TCTCACTTGA GATCAGGAGT TCAAGACCAG CCTGGCCAGC ATGGCGATAC TCTGTCTCTA 8536

CTAAAAAAAA TACAAAAATT AGCCAGGCAT GGTGGCATGC ACCTGTAATC CCAGCTACTC 8596

GTGAGCCTGA GGCAGAAGAA TCGCTTGAAA CCAGGAGGTG TAGGCTGCAG TGAGCTGAGA 8656 

TCGCACCACT GCACTCCAGC CTGGGCGACA GAATGAGACT TTGTCTCAAA AAAAGAAAAA 8716 
GATACAACAG GCTACCCTTA TGTGCTCACC TTTCACTGTT GATTACTAGC TATAAAGTCC 8776

TATAAAGTTC TTTGGTCAAG AACCTTGACA ACACTAAGAG GGATTTGCTT TGAGAGGTTA 8836

CTGTCAGAGT CTGTTTCATA TATATACATA TACATGTATA TATGTATCTA TATCCAGGCT 8896

TGGCCAGGGT TCCCTCAGAC TTTCCAGTGC ACTTGGGAGA TGTTAGGTCA ATATCAACTT 8956 

TCCCTGGATT CAGATTCAAC CCCTTCTGAT GTAAAAAAAA AAAAAAAAAA GAAAGAAATC 9016 

CCTTTCCCCT TGGAGCACTC AAGTTTCACC AGGTGGGGCT TTCCAAGTTG GGGGTTCTCC 9076 

AAGGTCATTG GGATTGCTTT CACATCCATT TGCTATGTAC CTTCCCTATG ATGGCTGGGA 9136

GTGGTCAACA TCAAAACTAG GAAAGCTACT GCCCAAGGAT GTCCTTACCT CTATTCTGAA 9196 

ATGTGCAATA AGTGTGATTA AAGAGATTGC CTGTTCTACC TATCCACACT CTCGCTTTCA 9256

ACTGTAACTT TCTTTTTTTC TTTTTTTCTT TTTTTCTTTT TTTTTGAAAC GGAGTCTCGC 9316 

TCTGTCGCCC AGGCTAGAGT GCAGTGGCAC GATCTCAGCT CACTGCAAGC TCTGCCTCCC 9376 

GGGTTCACGC CATTCTCCTG CCTCACCCTC CCAAGCAGCT GGGACTACAG GCGCCTGCCA 9436

CCATGCCCAG CTAATTTTTT GTATTTTTAG TAGAGACGGG GTTTCACCGT GTTAGCCAGG 9496

ATGGTCTCGA TCTCCTGAAC TTGTGATCCG CCCGCCTCAG CCTCCCAAAG TGCTGGGATT 9556

ACAGGCGTGA GCCATCGCAC CCGGCTCAAC TGTAACTTTC TATACTGGTT CATCTTCCCC 9616 

TGTAATGTTA CTAGAGCTTT TGAAGTTTTG GCTATGGATT ATTTCTCATT TATACATTAG 9676 

ATTTCAGATT AGTTCCAAAT TGATGCCCAC AGCTTAGGGT CTCTTCCTAA ATTGTATATT 9736 

GTAGACAGCT GCAGAAGTGG GTGCCAATAG GGGAACTAGT TTATACTTTC ATCAACTTAG 9796

GACCCACACT TGTTGATAAA GAACAAAGGT CAAGAGTTAT GACTACTGAT TCCACAACTG 9856

ATTGAGAAGT TGGAGATAAC CCCGTGACCT CTGCCATCCA GAGTCTTTCA GGCATCTTTG 9916 

AAGGATGAAG AAATGCTATT TTAATTTTGG AGGTTTCTCT ATCAGTGCTT AGGATCATGG 9976 

GAATCTGTGC TGCCATGAGG CCAAAATTAA GTCCAAAACA TCTACTGGTT CCAGGATTAA 10036 

CATGGAAGAA CCTTAGGTGG TGCCCACATG TTCTGATCCA TCCTGCAAAA TAGACATGCT 10096 

GCACTAACAG GAAAAGTGCA GGCAGCACTA CCAGTTGGAT AACCTGCAAG ATTATAGTTT 10156 

CAAGTAATCT AACCATTTCT CACAAGGCCC TATTCTGTGA CTGAAACATA CAAGAATCTG 10216 

CATTTGGCCT TCTAAGGCAG GGCCCAGCCA AGGAGACCAT ATTCAGGACA GAAATTCAAG 10276

ACTACTATGG AACTGGAGTG CTTGGCAGGG AAGACAGAGT CAAGGACTGC CAACTGAGCC 10336

AATACAGCAG GCTTACACAG GAACCCAGGG CCTAGCCCTA CAACAATTAT TGGGTCTATT 10396

CACTGTAAGT TTTAATTTCA GGCTCCACTG AAAGAGTAAG CTAAGATTCC TGGCACTTTC 10456 

TGTCTCTCTC ACAGTTGGCT CAGAAATGAG AACTGGTCAG GCCAGGCATG GTGGCTTACA 10516 

CCTGGAATCC CAGCACTTTG GGAGGCCGAA GTGGGAGGGT CACTTGAGGC CAGGAGTTCA 10576

GGACCAGCTT AGGCAACAAA GTGAGATACC CCCTGACCCC TTCTCTACAA AAATAAATTT 10636

TAAAAATTAG CCAAATGTGG TGGTGTATAC TTACAGTCCC AGCTACTCAG GAGGCTGAGG 10696

CAGGGGGATT GCTTGAGCCC AGGAATTCAA GGCTGCAGTG AGCTATGATT TCACCACTGC 10756

ACTTCTGGCT GGGCAACAGA GCGAGACCCT GTCTCAAAGC AAAAAGAAAA AGAAACTAGA 10816

ACTAGCCTAA GTTTGTGGGA GGAGGTCATC ATCGTCTTTA GCCGTGAATG GTTATTATAG 10876 

AGGACAGAAA TTGACATTAG CCCAAAAAGC TTGTGGTCTT TGCTGGAACT CTACTTAATC 10936 

TTGAGCAAAT GTGGACACCA CTCAATGGGA GAGGAGAGAA GTAAGCTGTT TGATGTATAG 10996

GGGAAAACTA GAGGCCTGGA ACTGAATATG CATCCCATGA CAGGGAGAAT AGGAGATTCG 11056

GAGTTAAGAA GGAGAGGAGG TCAGTACTGC TGTTCAGAGA TTTTTTTTAT GTAACTCTTG 11116 

AGAAGCAAAA CTACTTTTGT TCTGTTTGGT AATATACTTC AAAACAAACT TCATATATTC 11176 

AAATTGTTCA TGTCCTGAAA TAATTAGGTA ATGTTTTTTT CTCTATAG GAA ATG AAT 11233 

Glu Met Asn 
85 

CCT CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG 11281
Pro Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln

90 95 100

AGA AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA 11329
Arg Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser

105 110 115

TAC GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA 11377
Tyr Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys
120 125 130 135

CTC ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC 11425
Leu Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe

140 145 150

ACT GTT CAA AAC GAA GAC TAGCTATTAA AATTTCATGC C 11464
Thr Val Gln Asn Glu Asp
155 

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(F) TISSUE TYPE: placenta 

(ix) FEATURE:

(A) NAME/KEY: 5'UTR

(B) LOCATION: 1..15606

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 15607..15685

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 15686..17056

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 17057..17068

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 17069..20451

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: leader peptide

(B) LOCATION: 20452..20468

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: mat peptide

(B) LOCATION: 20469..20586

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 20587..21920

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 21921..22054

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: intron

(B) LOCATION: 22055..26827

(C) IDENTIFICATION METHODS: E

(A) NAME/KEY: mat peptide

(B) LOCATION: 26828..27046

(C) IDENTIFICATION METHODS: S

(A) NAME/KEY: 3'UTR

(B) LOCATION: 27047..28994

(C) IDENTIFICATION METHODS: E
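
The leader peptide, intron and mature peptide locations given for SEQ ID NO: 14 appear to be those of SEQ ID NO: 13 shifted by a constant. The following minimal sketch, not part of the original listing, tests that reading. It assumes 1-based inclusive coordinates; `SEQ13_INTERNAL`, `SEQ14_INTERNAL`, `OFFSET` and `shift` are hypothetical names used only for illustration, and the two untranslated regions are left out because their lengths differ between the two sequences.

```python
# Leader peptide, intron and mature peptide locations as listed for
# SEQ ID NO: 13 and SEQ ID NO: 14 (assumption: 1-based, inclusive).
SEQ13_INTERNAL = [
    ("leader peptide", 4, 82),
    ("intron", 83, 1453),
    ("leader peptide", 1454, 1465),
    ("intron", 1466, 4848),
    ("leader peptide", 4849, 4865),
    ("mat peptide", 4866, 4983),
    ("intron", 4984, 6317),
    ("mat peptide", 6318, 6451),
    ("intron", 6452, 11224),
    ("mat peptide", 11225, 11443),
]
SEQ14_INTERNAL = [
    ("leader peptide", 15607, 15685),
    ("intron", 15686, 17056),
    ("leader peptide", 17057, 17068),
    ("intron", 17069, 20451),
    ("leader peptide", 20452, 20468),
    ("mat peptide", 20469, 20586),
    ("intron", 20587, 21920),
    ("mat peptide", 21921, 22054),
    ("intron", 22055, 26827),
    ("mat peptide", 26828, 27046),
]

# Offset taken from the first shared feature: the leader peptide starts at
# position 4 in SEQ ID NO: 13 and at position 15607 in SEQ ID NO: 14.
OFFSET = SEQ14_INTERNAL[0][1] - SEQ13_INTERNAL[0][1]


def shift(features, offset):
    """Return the features with start and end moved by offset."""
    return [(name, start + offset, end + offset) for name, start, end in features]


consistent = shift(SEQ13_INTERNAL, OFFSET) == SEQ14_INTERNAL
print("offset:", OFFSET)
print("SEQ ID NO: 14 features equal shifted SEQ ID NO: 13 features:", consistent)
```

With the locations as listed, the offset is 15603 and every shifted feature coincides with its SEQ ID NO: 14 counterpart. This compares feature coordinates only, not the underlying bases.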

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



ACTTGCCTTA AAAGCTTTGC ATAGGTAGAC AACATTAGAT TAATTTCCTT GCTCACATCT 60

GTTCAAGAAA AATCATTTAA GTTATAAAAT ATAACAAACC TTCTGCATTA TAAGACTGAT 120

GTTTAGAAAT ATAAACATTT TATACATCAC CATTTAAATC TTTCTCCAAG GCTTCATCTT 180 

TATAAAATAG TCCGGAAATT TCAGAGAAAG ATGAATCTGA TTTTCCAAGA GAGGACAGCT 240 

GTGGACTATC TGGCACTGGA GACTAAATAA AGAAAGCAGG TACAGTCAAT AAGATCTTCA 300

GGACATATAC ATTTTGTTTA TTAAGAAAAA GCAAATAAAA CATTTTTCAG AAAAAGGCAA 360

ACATGCTAGA AAGCATATGA CTTAGTCATT TGAGTTTTTA TTATTAAGGA AATTTACAGG 420

CCCAAGAAAC ACCTTGCTCA ATATATTAAA TTTTATTTTG GTTTTCAACT AGACTTTGCT 480

TTTCATTTGT TTGTTTTTGT GACAAGTTCT CGCTCTGTCA CCTAGGCCAA AGTGTAGTGA 540 

CACAATCTTA GCTCACTGTA GCCTCCTAGA TTCAAGTGAT CCTCCTGTCT CAGACTCCTG 600

AGTAGCTAGG ACTACAGGAA CATTCCACCA TGCCCAGCTA ATTTTGTTTT GTTTTGTTTT 660

GTTTTCAGAG ACAATGTATT GCAGCGTTGC CCAGGCTGAT CTGAAACTCT TAGCCTCAAA 720

CGATACTCCT GCCTCAGCCT CCCAAAGCAC TAGGATTACA GACATGAGCC AATGCGCCCA 780 

GCCTTAAATT AGACTTTAAA TGTGGTTTTA AACTCCTGTT GAAAAAGCGT CTGGTATCTT 840

GAACCAGTAG ATGTTTTCAT AGCAATGAAG CTAAACTGTA ATTTAGACAG TAGCCAAATG 900 

CTTGTGAAAT TTTGCTAAAT AATATAATCT TCAAGGGAGC AAATCATGTC CCAAATGCAA 960 

AAGATCAACT GGTGGGGGCA GTAGTAAAAG ACAGGATACT GTGCTCTTTA AAAGGTCAGT 1020

AACTATAGTA CCTAGTTATC TTACTTATCA CAGCAAAATA ATTACATAAA ATCCTATGGA 1080 

TCATAAAGGC ACAGACTCAC TTCTGTCTCT AGATCTCAAG CTACCAAAAA GAAATCTCCC 1140 

AATAGTTTCT TGGAGGCCTA TACTTAGTGA AAAAGCAGCT GGAATCAACA TAGTTCCTCC 1200 

TATGTTGTAG GACAATCCTA GCTCTGGGCA TACGAATACA TTAAATCCCA CTTATCTATA 1260 
GAGCTTTCTT AAAGGGAAGA AATTTGAGTA GTATGTAAAA CAGAATAAAA GATTAAGGCT 1320

CCATAGGCAT ACAGCTTACC TCCAATTCTC TTGGCCTCTT GCAATTTCTA TTATCAGGCT 1380

TTACAAGGTG ATTTGCCATC ATATTCCGAA GGCACCAGCT ACAAAGCTTA GAACAATGCC 1440

AGATTTAGGT ACAAACTCCA TGCTACAAGC TCTCTGGAAT CCTTCCCTGT TTCCCACTCC 1500 

TACTGCTGAT GTTAATTTAG ACTGTCATTA TCTGTCACTT TCCTAAACTC AATTTCTCCC 1560

TCCTCTAAAT CATTGTATCA ACTGCTATTT GGGTAATCTT TCAAAACTTT GATTACTGCA 1620

TTCCTTTAAC TCAAAAACTT TCATTGTTCC AGAATAAGTT GAAATTCCAT GATATGGCCT 1680

TCAAGGTCCT GTATTATCTG GTGCAAGCCT ACTAGTCCCA TCATTTTCAA CTACTCCTCT 1740

CTATGTACTT AGCCAAATGA GTCTCTCTGG CAATTCTGCC TTGTTTCAGG ACTGGGTCAG 1800

TTAAGATTCT TTTATCTTCG GCCGGGCGCG CTGGCTCACG GCTGTAATCC CAGCACTTTG 1860 

GGAAGCTGAG GCAGGAAGAT CACCTGAGGT CGGGAGTTCG AGACCAGCCT GGCCAGCATG 1920

GTGAAACCCT GTGTCTACTA AAAATCCAAA CATTAGCCAG GCGTGGTGGC AGGCGCCTGT 1980

AATCCCAGCT ACTTGGGAAG CTGAGGTGAG AGAATCGCTT GAACCCAGGA GAGGGAGGTT 2040

GCAGTGAGCC GAGATTGTGC CATTGCACTC CAGCCTGGGC AACAGAGCGA GACTCCACCT 2100 

CAAAAAAAAA AAGGATTCTT CTATCTTCAC AAAATCTTAA TGTTTAAACA GGTCTTACAG 2160 

TTCATCTAAT TCAATCTCAT TTTTTACAAG TGAGAAAACA GGGACAGTGA CGGTGGATCA 2220

AGTGACACCA GTAAGACTGA GCTAAATTAG AACCGAGATC TCACTCGAGT CTGAGGTTAT 2280

TCCCACTGTC CAACCTTACT TTAAAGTAGC TTCAAATTTT ACTTTTACTT TTCCATAAAT 2340 

TCGGAAGGGA TTTTCCCTAG GAGTCCAAAT GTTGAAACCT GGAAGGGTAT AGTCTCTGTG 2400

TCTTTGAGAT GAGGGGAGCC CTGTCCATAT TCAAGTTATC AATTGACTTT GTTGTTTTTG 2460

AGAAACGATG CTGATTTGGG TAACTTTAAC ACATCTGTTT GATTAGTCCT ATAAAATATG 2520

CATATATAGA AGACAGAAAG AGCAACAACA AATTTGAAAG ATGCTTGTTA AGTAAATTCT 2580 

GTATCGTACG TGTCCATTCC TGCCAGTACC TTTATAGTAT GTAAGTTTAC GTGCTGTAAT 2640

AGTATTAATA GTATCTAGAA AATACTACAC ATGCACAGCA GTGCTAACTT TGCCTTGGGA 2700 

GTTGGAAAAT ACTTCAGAGA AGCCAACAGG CAGATTTTTC TCTCTTCCCT TCCCCTTCTA 2760

ATTTTCCCTT TCCCCTTCAC CCCCTTCTCT TCTCTCCCCA AGTAACACTG TGCACCTATG 2820

TCAAACGAAA ACTTATAATC AAGTAACTGT TTCTGCAAAA ATAAGTTCGT TTTCCTGTCA 2880

TGGCTCAAGG CCTCAGCAGA TCCAGGCCTG GTGGACGGGC TGGTCTTCGT CGTGTGCCAA 2940

ACACTGACCA CTGCCCTGGC TCTGCCATCT TAGGCTTAGT GACCTGGCTG TTACTAAGCA 3000

CTGTCCCCTC TGCCCCATGC AGCTGTCTCC TTCTAGTCTT CTCCCTCTTC TCAACGCGAT 3060 

CCTAGCCCCT CAGGCCATTT CACCTCCATT TTCCCTCACT TCCCGCCGCC CCTCCGCACT 3120

TCCTCCCTAC TGTTGTTTCC GCCCCACTAG AGCCCCTCAG AGAAAGTTTC CATCCTCGCA 3180

CCCTTCCTTG TGTCACAGCC CGTCACATTC TCACAGGCGC CCATCCCTCC AGCCCCACCC 3240

CAAGGCCAAT GTACTTCGCG GTATGGGGAC CTTCCTCGTC AGCGAACGCG AGGGAGTGAA 3300

GACCCTGGGC GCGGGGTGCT CGGACTTCGG GGGTGGAGGT GGGAAGCGCG CCGCACTCCC 3360

AGCAGCCCCT GCACGAGTCA CGTGACAGCT CTCCCACCAC CACCCCCCCC AACTTCCCCA 3420

CCGTAGCCTC CCAGAGCCAG GCCCCACGGA AAGGCAGCTT TTTCCCGGTT TTCTCCCGCT 3480

CTTTCCCCTC CACTTGGAAT ACTCGTGAAA CAAAAATCTC TCCCTGCCAC CCTGTGTGTG 3540

TTTGAACCAG GAAAAAATCT GAAACTGGTC AAGAAAGAAC AAGGAAGACT TGCCAAAGCA 3600

AGGCCGGTGT GTGTCCCAGC AGCTTAGAAT CTCAGCAAAG GAACACAAAA TAGCACATCC 3660 

ACGGCCTCTT TTCGAGTAAA ATTTACTTGG TTTGTTTGCA GGAAGGGTTT AAAACTGCGT 3720

TTGCAGATGC TCTGTTTGCA GGAAGGCTTT AATCACGTGT TCCCCTGGCC CACAAGCAAG 3780

GCTTTTAGAT CCAGAGCCTC AGTTACTGCC CCCTCTTCCT CTTTGGTGCA ACCAAACGTT 3840

CAGAATCACG CCTTCTTAGA AAATTCTTAC CCCGGGTGTG TCAATAAGTT AAGTCTAATT 3900

GGCAACAGCT ATCAAAAAGT GTTGCATAAC ACACATGGCT CACATAATTG TAGCTTTGCC 3960

TCATCGGGTG TTTTAATGCG GAGGCTTTGA CCTGCAATTT CAAAGATATA CATTCCAAGC 4020

TTACGCCCAG TTAGTGGATG TGGAAGAAAA AAAAAAGCAA ATTACCTCAT AACACAAAGG 4080

TCAATAACAC ACATCCATAA GCTCCAGGTA CAAAATCTTA CATCTTAGAG AACTATATTT 4140

AACATTTACA TACATTACTA AGGTTTTTTT TTTCCTTTTG CTTGATTAAA TGTTAGTTAT 4200

CATTAAGTCT TGGAATTATT CTGTGTGTGT ATATTTATTT GCTGTTTGTG AAGAAGCCGG 4260 

TTGTTTTAAA TAAGTTCCTA GAAAATAAGC GCTCAATGTG TTTAATCTGA GTTG CTAAT A 4 32 0 

TTGTGAAATA TAGGCCACAT AAT AC TAG C C TAGATAACTA TGGCGAAGTA AGGAGTCTCA 43 80 

AACACTGTCC CAGAAGAATA GCAATCTGTG TTGAATTTTT ACCCTCTGTG GTAAAATGAA 444 0 

GGGAAAAGGA ATGAAGTTTT AGTTTGCCTT AATTTTTATC TTTATTGTTT CAGACTCTTC 4 50 0 

AGCAGTATAA AGTTTTCATC AAGTCAAATA TATTCACTTT AAAGTGACTG TGCTTTATTC 4 56 0 

TGATACCATG TCCTTCCTAA TTTGGGGGGC CAGGTGAGAT AAGTTTTATG AAATAAAAAG 4620 

ATTAAAAATT CTTACATTTT TAGTGTCCTT CCTTGGTAAA ATGTAGAGTT GTCCACTGTG 46 8 0 

TTTATCTCCT CCTCCTTATT ATCATGGTTG CTGTTATTAT TTTTAATGGT TCATTAAACC 474 0 

CAAGGGTCTG GGAAATACTC ATGGAATTCA TCTCACAGCC TTCACACTGT ATGATATTTA 4 8 00 

AACAGGTGGT TGTCCATCTG ATTCTTAAAA TATTTCCAAG AAAAATGATT CCACCTAATG 4 860 

CATAAATGCT TTCATCAGAT TAAGAGAACA CCATGGACAT TTTATTTTAT TTTATTTTTT 4 92 0 

AAATATTAAC TTCCATTGCA TAAGCTAAAT GGGTAGGAAT AAGTGAGATG ATATTGTTAT 4 980 

CTAGAGCTTT AAAATATTCA AAGGGCTGTC ATCATTATCT CATTTAATCT TTGAAAACAA 5040 

CTCTATGAAG TACAAAGGAC ACTGAGACAT TTGTTGCTCT ATATCAAAGA AAAAAGTGTT 5100 

TGTCCCAAAA CTTCAAAATG TGTAAATTAC ACATTCTGCA TCTTTACAGC TGGAGAAAAT 5160 

TCACTGGCAA TGGAATATTT AAAATTAGAG CTTGCTTAGT GTGCTGCTTC TGATCACTAC 522 0 

TTGATCCCAC TTCGTGCTTT CATGTTAATT GGCCCAATTG GACTCTACAG TTGGAAGGTG 5280 
AAAACTTACT ATTTCAACTT GAGTCACGTA TGTATTCTTA TCATATACTT CTTAAAGGTA 5340

CTATTTTTTT TCTTCTGATA GTCACCACAC CAAGCACTTC CAGCCACCCT GCCACAGACT 54 0 0 

TCCTTTGTAA TCACTGTTGA AGGACATGAT GTTTTTATGA CTTCCCGAAA TGAAAACCCT 54 6 0 

ATCTTGTTTT TAAAACAAAC AAACCAACAA AAAGTAGTGT TTATGTAAGC ATTTTGTTCC 552 0 

CTGACTCTAG GAACCCCTCT GTTTTTATAT CAACTCTGTA CTGGCAAAAC ACAAAAACAA 558 0 

AATGCCACCT TGCTAATTCC CTTCCTAGCA AAGTAATACA GTTTAGCACA TGTTCAAGAA 564 0 

AAAAATGGCT AAGAAATTTT GTTTCCACTA ATTATTTTCA AGACTGTGAT ATTTACACTC 5 70 0 

TGCTCTTCAA ACGTTACATT TTATAAGACT ATTTTTTAAC ATGTTGAACA TAAGGCCTAA 576 0 

ATATATGTAT CCTTAAATTG TATTTCAAAT ATTTTAGGTC AGTCTTTGCT ATCATTCCAG 5820 

GAATAGAAAG TTTTAACACT GGAAACTGCA AGTAAATATT TGCCCTCTTA CCTGAATTTT 58 8 0 

GGTAGCCCTC TCCCCAAGCT TACTTTCTGT TGCAGAAAGT GTAAAAATTA TTACATAAAA 5 94 0 

TTCTAATGAT GGTATCCGTG TGGCTTGCAT CTGATACAGC AGATAAAGAA GTTTTATGAA 6 00 0 

AATGGACTCC TGTTCCACTG AAAAGTAAAT CTTAATGGCC TGTATCAACT ATCCTTTGAC 606 0 

ACCATATTGA GCTTGGGAGG AAGGGGAAGT CCTGAATGAG GTTATAAAGT AAAAGAAAAT 612 0 

ATTTGCAAAA TGTTCCTTTT TTTAAAATGT TACATTTTAG AAATATTTTA AGTGTTGTAA 618 0 

CATTGTAGGA ATTACCCCAA TAGGACTGAT TATTCCGCAT TGTAAAATAA GAAAAAGTTT 6240 

TGTGCTGAAG TGTGACCAGG AAGTCTGAAA AT G AAGAGAG ACAGATGACA AAAGAAGATG 630 0 

CTTCTAATGG ACTAAGGAGG TGCTTTCTTA AAG T C AG AAA GAG AT AC T C A GAAAGAGGTA 6 36 0 

CAGGTTTTGG AAGG C AC AG A GCCCCAACTT TTACGGAAGA AAAGATTTCA TGAAAATAGT 642 0 

GATATTACAT TAAAAGAAGT ACTCGTATCC TCTGCCACTT TATTTCGACT TCCATTGCCC 64 8 0 

TAGGAAAGAG CCTGTTTGAA GGCGGGCCCA AGGAGTGCCG ACAGCAGTCT CCTCCCTCCA 6 54 0 

CCTTCTTCCT CATTCTCTCC CCAGCTTGCT GAGCCCTTTG CTCCCCTGGC GACTGCCTGG 6 600 

ACAGTCAGCA AGGAATTGTC TCCCAGTGCA TTTTGCCCTC CTGGCTGCCA ACTCTGGCTG 6660 

CTAAAGCGGC TGCCACCTGC TGCAGTCTAC ACAGCTTCGG GAAGAGGAAA GGAAC CTC AG 6 72 0 

ACCTTCCAGA TCGCTTCCTC TCGCAACAAA CTATTTGTCG CAGGTAAGAA ATATCATTCC 67 8 0 

TCTTTATTTG GAAAGTCAGC CATGGCAATT AGAGGTAAAT AAG CT AG AAA GCAATTGAGA 6 840 

GGAATATAAA CCATCTAGCA TCACTACGAT GAGCAGTCAG TATCAACATA AGAAATATAA 6 90 0 

GCAAAGT CAG AGTAGAATTT TTTTCTTTTA TCAGATATGG GAGAGTATCA CTTTAGAGGA 6 96 0 

GAGGTTCTCA AACTTTTTGC TCTCATGTTC CCTTTACACT AAGCACATCA CATGTTAGCA 7020 

TAAGTAACAT TTTTAATTAA AAATAACTAT GTACTTTTTT AACAACAAAA AAAAGCATAA 708 0 

AGAGTGACAC TTTTTTATTT TTACAAGTGT TTTAACTGGT TTAATAGAAG C CAT AT AG AT 714 0 

CTGCTGGATT CTCATCTGCT TTGCATTCAG ACTACTGCAA TATTGCACAG AATGCAGCCT 7200 

CTGGTAAACT CTGTTGTACA CTCATGAGAG AATGGGTGAA AAAGACAAAT TACGTCTTAG 726 0 

AATTATTAGA AATAGCTTTC ACTTTAGGAA CTCCCTGAGA ATTGCTGCTT TAGAGTGGTA 732 0 

AGATAAATAA GCTTCTCTTT AAACGGAATC T C AAG AC AG A ATCAGTTACA TTAAAAGCAA 73 80 

AC AAAAAAT T TGCCCATGGT TAGTCATCTT GTGAAATCTG CCACACCTTT GGACTGGGCT 744 0 

ACAATTGGAT AATATAGCAT TCCCCGAGAT AATTTTCTCT CACAATTAAG GAAAGGGCTG 7500 

AATAAATATC TCTGTTTGAA GTTGAATAAC AAAAATTAGG ACCCCCTAAA TTTTAGGGCT 7 56 0 

CCTGAAATTC GTCTTTTTGC CTATATTCAG CTACTTTACG TT CT ATT AAA TCTTCTTTCA 762 0 

GGCCAGGTGC ACTAGCTCAT GCCTAGAATC TCAGGCAGGC CTGAGCCCAG GAATTTGAGA 76 8 0 

CCAGCCAGGG CAACACAGTC TCTACAAAAA AAT AAAAAAT TACCTGGGTG TGTTGGTGCA 7 74 0 

TGC CTGTAGA ACTACTCAGG ATGCTGAGGA CTGCTTGAGC CCAGGATAGC CAAATCTGTG 7800 

GTGAGTTCAG CCACTAAACA GAGCGAGACT TTCTCAAAAA AACAAACAAA AAAACAAACA 7 86 0 

AACTTCCTTC AAAATAACTT TTTATCTGCA ATGTTTTCCT ATTGCCTGTG AGATTAAATT 7 92 0 

TACTCTTTTA CCTGATTTCC AAAGCCCTCC AT AAT C T AAT CCGACTTTAC CTTGTGTTCA 798 0 

CTGCAAAATA GCAGGACTGT TCCACTACAA TCCAAAAATC ACAGGTTGGG TGCAGTGGCT 804 0 

CACTCCTGTA ATCCCAACAC TTTGGAAGGC CAAGGCAGGT GGATTGCTTC AG C T CAG GAG 8100 

TTCAAGACCA GCCTGGGCAA CATGGCAAAA ACCCTGTCTC TCCAAAACAT ACAAAAATTA 816 0 

GCCAGATGTG GTAGTATGTG CCTGTAGTCC CAACTACTCA AAAGGCTAAG GCAAGAGGAT 822 0 

CACTTGAGCC CAGGAGGTCA AGGCTACAGT GAGCCATGTT TACTGTGTCA CTGCACTCCA 8 28 0 

GCCTGGGTGA TAG AG C AAG A CCATGTCTCA AAAAAAAAAA AAAGAAAAGA AAAGAAAAAA 834 0 

ACATCGCTCT ATTCAGTTCA CCCCCACCAC AACATTGTTT TGATTATCAC ATAAATGCTG 84 00 

GTCCATTGCC TTCTCTATCT ATTCAAATCT TTAAGCATTC TTTGAGATTC AACTCAATTC 84 6 0 
TCCTTTTCAA ACT AGGC CAT TTAAACTACA TCAGTTCCAT TTTGATTTTC TTGCTTTGAG 8 52 0 
TCTACAGACT CAAAAACAAA AACTTAAAAA CTTATTTTTT AAGTTTTCTG CTACTCTCAC 858 0 
TTCTTCAACA CTC AC AT AC A CGCATTCATA ATAAGATGGC AGAATGTTCA AGGATAAAAT 864 0 
GATTTATAGA ACTGAAAAGT TAGGTTTTGA TCTTGTTGCT GTCAAGATGA CTACCTACCT 8 7 00 
GATCTCAGGT AATTAATTAT GTAGCATGCT CCCTCATTTC ATCCCATACC TATTCAACAG 8 760 
GATTGGAATT CCACAGCAAG GATAAACATA ATCATAGTTG CTTTTCAAGT TCAAGGCATT 8 82 0 
TTAACTTTTA AT C TAG TAG T ATGTTTGTTG TTGTTGTTGT TG T T T GAG AT GGAGCCCTGC 888 0 
TGTGTCACCC AGGCTGGAGT GCAGTGGCAC GAACTCGGCT CACTGCAACC TCTGCCTCAT 894 0 
GGGTTCAATC AGTTATTCTG CCTCAGTGTC CCAAGTAGCT GGGACTACAA GGCACATGCC 9000 
ACCATGCCTG GCTAATTTTT GTATTTTTAG TAGAAACAGG GCTTCACCAT GTTGGCCAGG 906 0 
CTGGTCTCGA ACTCCTGACC TCAAGTGATC CAGCCGCCTC GGCCTCCCAA AGTGCTGGGA 912 0 
TTACAGGCAT AAGCCACCGT GCCCAGCCTA ATAGTATGTT TTTAAACTCT TAGTGGCTTA 918 0 
ACAATGCTGG TTGTATAATA AATATGCCAT AAATATTTAC TGTCTTAGAA TTATGAAGAA 924 0 
GTGGTTACTA GGCCGTTTGC CACATATCAA TGGTTCTCTC CTTACAGCTT TAATTAGAGT 930 0 
CTAGAATTGC AGGTTGGTAG AGCTGGAACA GACCTTAAAG ATTGACTAGC CAACTTCCTT 9360
GTCCAAATGA GGG AACTGAG ACCCTTAAAA TTAAGTGACT TGCCCCAGAC AAAACTGGAA 94 2 0 
CTCATGTGTC CTAATTTCCA TCATGAAATT CTACCATTCA CTAGCCTCTG GCTAGTTGTC 94 80 
AAAGTATTGC ATAACTAAAT TTTTATGTCT GTTTTAAAGA ACAAATTGTC ACTGCTTACT 9 54 0 
CCTGGGAGGG TCTTTCTGAG GTGGTTTATA ACTCTTAAAA AAAAAAAAGT CAGTAGTCTG 9600 
AGAATTTTAG ACGAAATAGT CAAAGCATTT TTATCCAATG GATCTATAAT TTTCATAGAT 966 0 
TAGAGTTAAA TCAAAGAAAC ACGGATGAGA AAGGAAGAGG AAAATTGAGG AGAGGAGGAA 972 0 
TGGGGATGAG AAC AC AC T AC TTGTAATCAG TCATAGATGT ACTGAGAACT AACAAGAAGA 978 0 
ATTGTAAGAA AATAAGAATG AAGAATTCAA AATCAACACA TGAAATAAAA AGAAACTACT 984 0 
AGGGAAAAAT GGAGAAGACA TTAGAAAAAT TATTCTATTT TTAAAATTCT GTTTTCAGGC 9900 
TTCCCTCCTG TTCTTCCTCC TTCTCATTGG TTTTCAGGTG GAGGGAAAGT TTAAGATGGA 9 960 
AAAAATATAT ATATTCTACA CATCCCTTTC TACGCTGTTG TCATGGCAAC AAGGTTTATC 10020 
ATAGCAAACT TTTATTCATA CAACATTTAT TGAGTTCTTA CTGTGTGGTA AGCTCTTTCC 10 080 
AGGTGTTGAA AATTCAGGGG AAAAAAGACA ACTCATTGTC TTAAAACTCA GATGAAAGCT 10140 
GAACAGACCT ATTTTTAATC AAAGTAATCT CAATTTAGGG TAGTAAGAGC TATTTAAGAA 10200 
GCATGAACAG GTGTGAAGGA GGTAGGACTC TGAGGAGAGA ATAGTTAGCT AGGAATGAAA 10260 
GAGCAGAGAA GTTTTCCTAG AGGAACTATT AAAGCTGGGA GTTACGGGAT GAAAGATGAG 10320 
GCAGGGTTTG CAGGCAAAAA AAAAAAAAAG GCAGGGGAAG GGGAAGTTCT GGCCTGGCAG 103 80 
AGAGAATAAC TGTGGCAACA ATGGAGGAGA GTCTGGAAGC AAGAAAACCA AGTAGAAGAG 104 40 
TATTAAAATA GAAGATGCCA GGGGTAATGA GGGCTTGATT TAAAACAGTG CTGTTGGAGA 10500 
TGGAGAGGAG AT AC C AAATT CTGGAGACAT TTCTGAGTTA GAACCTACAG T ATT TAT C AG 10 560 
ACAAGGGAAA GATTAGACAA AGGAGTTAAG AATGACTCCC AGGTTTCAGT TTGGGGCAGG 10620 
TAACTAGGAC ATGTTTTGAA AAGTAATGTA TTGGATCTCT TACCATTGGA ACTATGTATG 10 6 80 
TGGAGC C AAA TTAAAATTTG TACATGTATA TAACTCTCCC CCCACCACCA GTAACTACTT 10 740 
CCCTAACTCT CTACTTTGTA GCCAGACTTC CTAAAAGAAT AGTTTGTAGT CACTGTCTTT 10800 
ACTTTTCCCC TCCCATTCTG TCCTAGATAT TTGTCCACCT ACCATCTGCT GCCTCCACTT 108 60 
TACCCAAACT GTTCTACGGT TGCCCAAAAC TTCCTAATTG CCAAATTCAA TGAACAAGTT 10920 
TAAGCTTATA TGTAAATTAG GAGCTCTACA GTTTGATTTC GAGCAGCCCC TCCTGAAACC 10980 
CTTTCTCTTT CGACTTCTGT GACACATCTC AGATTTACAA AACTGAACTA ATTATTTTAC 1104 0 
ACTTGAGCTG TATTTTCGTT CTTCTTTCTT GATGAATGAG GTAACCACTC AACAAATTGC 11100 
CCAAGCCAAA AACTACGAAG TCATCCTCAG TTCCTCCTTC TTCTGTTTGA CCCACAACAG 1116 0 
ATCAGCTGAG AAATCCCGCT GTTTAGTATC TCTTGAATTC ATTACCTTAA TTTATAGCCT 112 20 
CATCAACTCT TAATTGTTAA AATTACTTCA GTAGTTGTTG TCTGACCTCT GTCCAATCTT 1128 0 
GTTCAATCAG GTCCATTCTT TTGTTCTTGG TGGTGGTGGT GGTGTTGACA GAGTTTCGCT 1134 0 
TTTGCTGCCC AGGCTGAAGT GCAGTGGAGC ACTTCACTGC AACCACAGCC TCCTGGGTTT 114 00 
AAGCAGTTCA CCCTCCCGAG TAGCTGGGAC TACAGGTATG TGCCACCACA CCCAGCTAAT 11460 
TTTGTGTTTT C AG TAG AG AC AGGGTTTCAC CATGTTGGTC AGGCTGGTCT CAAACTCCTG 1152 0 
ACCTCAAGCA ATCCACCCAC CTCAGCCTCC CAAAGTGCTG GGATTACAGG CATGAGCCAC 1158 0 
TGCACACGGA CCAGATCCAT TGTTTATGTT GCTTCTAGAG TGAGTTTTTA AAACACAAAT 1164 0 
TTGAC CAT AT CTTTCTCCAA TTTAAGTCAG TATTTTTTTT TTCAGGAAAA AACAGTTCAA 11700 
ACTCTTTAGT CTGCTTACAC AAGGCCTTTG TAGTCTGACT CTTCTTTCCA AGCTTTCATC 11760 
AAAGTAT AC T GCAAGTTACA TTTTATGTGA ATTGAATTAG GCAACGGTAT AAAAATTATA 1182 0 
GTTTATATGG GCAAAATGGA AATAATGTTA ACTCTTCCAA ATAGTTTATC TAGAATGACA 118 8 0 
TAATTTCAAA GCTGTCAGGT CAAATGAGTT ATAAACTGTT AACACTATTG CCACATGCAA 11940 
GTGTCTCTTA TACTTGGTAG AATTATCTGC TTCCATGTCA TTATTATGTA AATTAGACTT 12 000 
TAAATAACTC AGAAGTTCTT CAGACATACA GGTTATTATT GTGCTTTTTA AACATAATTT 12 06 0 
TAAATAATTT TATATATGAT AATGTTATCC AAGTGCTAAG GGATGTATTG TTACTGCTGT 1212 0 
GCAAAAAAAA AAAAAAAAAA AACTCCAAAT AAATATGTTG AAACCAAGTT TATATGCAAG 12180 
AAAACAATAT TAAAAAGGCC AAAGT AC C AC CATAATAGGC TGTGTGGAGA CGGCAGGCTA 122 4 0 
CAAAACACTA GTAATAATGC TGAGAAAGTT GAAAAAAGAA AGAAAGCAAC AATATGCTTT 123 0 0 
GGTTGTTGTA GGTTTATGTA CTCCAAGAAT ATCTCCTCTC AAACTTTTAC GTTTTTTCCA 12 36 0 
AAGAAAAGTT AACTTTGGCT GGGCGCAGTG GCTCTTGCCT GTAGTCCCAG CCTTTGGGAG 1242 0 
GCCAAGGCGG GCAGATCACC TGAGGTCAGG AGTTTGAGAC CAGCCTGACC AAAAATGGAG 124 8 0 
AAACCCGCCC CCCTCACTAC TAAAAGAATA CAAAATTAGG CCGGGCACAG TGGCTTACCC 12 54 0 
CTGTGATCCC AGCACTTTGG GAGGCCGAAG CAGGAAGATC ACCTGAGGTC AGGAGTTCGA 12 60 0 
GACCAGCCAT GGAGAAACCC GTCTCTACTA AAAATACAAA ATTAGCCGGG CGTGGTGGTG 12660 
CATGACTGTA ATCCCAGCTA CTCAGGAGGC TAAGGCAGAG AATCACTTGA ACCCAGGCAG 12720 
TGGAGGTTGC AGTGAGCCGA GATCGTGCCA TTGCACTCCA GCCTGGGCAA CAAGAGCGAA 12 780 
ACTCTGTATC CAAAAAACAA AAGAAAAGAA AAGGTAACCT TGAACTATGT GAGATCTTTA 12 840 
GAAATGCATT CTTTCTGTAA AATGTGACTA CATTTGCCTT ATTTATGGTA AAAATGTTGA 12 900 
GGCCTCAAAC AACCCATATT TTCTCGGTCT CCCCGCTGCC TAGCCTTTGT TCACATTGCT 12 960 
TCTTCTTGGT GGAAGCTCTT CCTCTGGCCT TGAAAATGCC TGCTTCTCTT TCAAGGTAGC 13 020 
ACAGTCATCA CTTTCTGTGG TAACCTTCTC CAGCACCATC AAACAGAAAG AATGAATCTC 13 080 
TTGTAAATTC AGCTCTTACG TCATTCATTA CATTATTTTG TAACTCTTTA TAGATTCTTC 13140 
TCTCCCACTA GACTCTGAGT CACTGGAGAG TAGGAGCCAA CTCTCATTCA TGTGTGGTTT 13 2 00 
GGTCAGCTAC TGGCCACATT CCTGATGCAT AGTTAATGCT CAAACCTTAA CTGGTGAATC 13260 
AGCTCAAATA TTGTCCTTCT CTAAATCCAT TCACTCATTG ACTAACTATG TACTCAAAAT 13320 




AGTAAAC AC C AGTAATTTAA TCCAATTCCT GCCCATACTG CTTGGTACAT TTCAGGTGAA 1338 0 

TTAGTTTGAT AAATATGTGT GTATTACATA ATATTAAAGT ATGTACAGAA GATCATGCTA 134 4 0 

ATCATAATTC ACAACTGATA ACTAATCAAA CATAAATGCT CTCAGGTTAA CAAATGTCTG 13500 

CCTTCTCAGT TAATGCAGTC ATTAACAAAC ACCTTCTGAT GCTGATAATA GGGCCTTGTT 13560 

CAGCAATGAA GCCATAAAGG TGAATAAAGA ACATGCCCTC GTGGAGCTCA CAGCCTAGTC 13 620 

ATTATTGTTG TGATTTTTAA TATTAATGTT GGTTTGGGTT TTGGTGAAAA ATGTTTAGAC 13 68 0 

TTATCTTAGT GATCTTTTCA TCCTTTGCTA TATTATTTTT CTCTAAGAGT CTTCCTTATC 1374 0 

CCCTCCTTTA AAAAACTAGG TGATAATTCT AAATTGTAAA TTTAAATATT ATAAATAGCT 13 8 00 

TATAAAATTT AATATTTATA ATATTTAAAT GTTTGATAAA TATTTAAATT TTATAATATT 1386 0 

TAAATGTTTA TTTAAATTCA TTTGTACATC AGTTTTTATT TTATTTAAAT GTGTTGGCCA 13 92 0 

GGCATGGTGG CTGACACCTA TAATCCCAGA ACTTTGAGAG GCCAAGTCAG GCAAACCATT 13980 

TGAGCTCAGG AGTTTGAGAC CACCCTGGGC AACGTGGTGA AACCCTGTCT CTACCAAACA 14 04 0 

TATGAAAACT TATCTGGGTG TGGTGGCACG CATCTGTGGT CCCAGATGGG AGTCCCAGGC 1410 0 

TAAGATGGGA GAATCGCTTG AACCCAGGTG AGAGGGGTGG GGTGGATGTT GCAGTGAGCT 14160 

GAGATCGTGC CACTGCACTC CAACCTGGGT GACAGAGTGA GACTCCATCT CAAAAAAAAA 14220 

AAATGTTATC TAAATAAGAT AAATTTAATA ACTGTTCGCA CTTAGATGAG CATAAGGAAC 142 8 0 

TAAACCTAGA TAAAACTATC AAATAAGGCC TGGGTACAGT GACTCATGCC TGTAATCTCA 14 34 0 

AGCACTTTGG GAGGCCAAAA TTATACAAAG TTAGTTGTAT AACACCAACT AACAACTATT 14400 

TTGGGGTTAG CTTAATTCAG ATTAATTTTT TTTAAACTGA GTTTTAAATT CCTGCTTACT 14460 

CTACCATACA TGCTAGGCCT CATATTATGC TAGAAAAATT TTGAGCACAG ATTTATGAAT 1452 0 

ACTCTCCTGC ATACCATTTA ATTTTTAAAC AAATTTTAAT G C AGT AT AT A TGTGCCTTTT 14 580 

TACCAACACA TTAAATAATA AGATCTACTG TGAGGACTAA ATTTCTGTAA TTTCAAAGTA 14640 

GTAATGAGTT TAAACCATGT CTCAAGATCT CTGCAATAAC TGTAGCACAA CAGAAAATAG 14 700 

GTATTTCTAT TAATGACAGA GTCACAAGTA CTACTAATAA TACTGTGGTT TGTTTCCTGC 14 760 

AACTAATCAT GGGAGGAATG CTAAATTTCA GAGGTTGGTG AAAATACATG TGTATTTTTT 14 820 

TCCCCATCCA AGTTCACAGA TTTCTCACAC TGAGAACTCC TATTCCATAA CAAAATTCTG 148 8 0 

GAAGCCTGCA CACCGTATTG GAAGAAGGGC AGAAAGGAAA AGCAAATGGA AGGATTTAAA 1494 0 

TTTTTTTCAA ATCCTGTATC CCTTGATTTT ACAGCAAGAT TGTATTTATG TATTACTTGT 15000 

GTTAAAAATA TAGTATAATC GAG AC T C C AG AT C AAAAAT C ACCGCAGCTC AGGGAGAAAG 15 060 

AGGGCCACCA AATGC C AG AG CCCTTCAGCC TTCTCCCACC CTGCCTGTAC CCTCAGATGG 15120 

AAGCACTTTT TTATCATTGT TTCACCTTTA GCATTTTGAC AATGAAGTCA CAAACCTTCA 1518 0 

GCCTCTCACC CATAGGAACC CACTGGTTGT AAGAGAAGGA TGAAGCCAGT CCTTCCTAAA 15240 

GGGCACGATT AGATGTGTTT ATGGCATCCT CAGGTGAAAC TATATTTATA TTGACAATAT 15300 

ATTTATATTT CTCAAGGAAT ACTAGAATAA TGATTCAGTT CAGTACTAGG CCATTTATCT 1536 0 

ACCCTTTATA ATATTGTTTA ATGAGAAAAT GCTTTCTATC TTCCAAATAT CTGATGATTT 1542 0 

GTAAGAGAAC ACT T AAACAT GGGTATT CAT AAGCTGAAAC TTCTGGCATT TATTGAATGT 154 8 0 

CAAGATTGTT CATCAGTATA CTAGGTGATT AACTGACCAC TGAACTTGAA GGTAGTATAA 15 540 

AGTAGTAGTA AAAGGTACAA TCATTGTCTC TTAACAGATG GCTCTTTGCT TTCATTAGGA 15 6 00 



ATAAAG ATG GCT GCT GAA CCA GTA GAA GAC AAT TGC ATC AAC TTT GTG GCA 15651
Met Ala Ala Glu Pro Val Glu Asp Asn Cys Ile Asn Phe Val Ala
-35 -30 -25

ATG AAA TTT ATT GAC AAT ACG CTT TAC TTT ATA G GTAAGGCTAA TGCCATA 15702
Met Lys Phe Ile Asp Asn Thr Leu Tyr Phe Ile Ala
-20 -15 -10

GAACAAATAC CAGGTTCAGA TAAATCTATT CAATTAGAAA AGATGTTGTG AGGTGAACTA 15 7 62 

TTAAGTGACT CTTTGTGTCA CCAAATTTCA CTGTAATATT AATGGCTCTT AAAAAAATAG 158 22 

TGGACCTCTA GAAATTAACC ACAACATGTC CAAGGTCTCA GCACCTTGTC ACACCACGTG 15882 

TCCTGGCACT TTAATCAGCA GTAGCTCACT CTCCAGTTGG CAGTAAGTGC AC AT CAT GAA 15942 

AATCCCAGTT TTCATGGGAA AATCCCAGTT TTCATTGGAT TTCCATGGGA AAAATCCCAG 160 02 

TACAAAACTG GGTGCATTCA GGAAATACAA TTTCCCAAAG CAAATTGGCA AATTATGTAA 16062 

GAGATTCTCT AAATTTAGAG TTCCGTGAAT TACACCATTT TATGTAAATA TGTTTGACAA 1612 2 

GTAAAAATTG ATTCTTTTTT TTTTTTTCTG TTGCCCAGGC TGGAGTGCAG TGGCACAATC 16182 

TCTGCTCACT GCAACCTCCA CCTCCTGGGT TCAAGCAATT CTCCTGCCTC AGCCTTCTGA 16242 

GTAGCTGGGA CTACAGGTGC ATCCCGCCAT GCCTGGCTAA TTTTTGGGTA TTTTTACTAG 16 3 02 

AGACAGGGTT TTGGCATGTT GTCCAGGCTG GTCTTGGACT CCTGATCTCA GATGATCCTC 16362 

CTGGCTCGGG CTCCCAAAGT GCTGGGATTA CAGGCATGAA CCACCACACA TGGCCTAAAA 16422 

ATTGATTCTT ATGATTAATC TCCTGTGAAC AATTTGGCTT CATTTGAAAG TTTGCCTTCA 16482 

TTTGAAACCT TCATTTAAAA GCCTGAGCAA CAAAGTGAGA CCCCATCTCT ACAAAAAACT 1654 2 

GCAAAATATC CTGTGGACAC CTCCTACCTT CTGTGGAGGC TGAAGCAGGA GGATCACTTG 16602 

AGCCTAGGAA TTTGAGCCTG CAGTGAGCTA TGATCCCACC CCTACACTCC AGCCTGCATG 16662 

ACAGTAGACC CTGACACACA CACACAAAAA AAAACCTTCA TAAAAAATTA TTAGTTGACT 16722 

TTTCTTAGGT GACTTTCCGT TTAAGCAATA AATTTAAAAG TAAAATCTCT AATTTTAGAA 16782 

AATTTATTTT TAG T TAC AT A TTGAAATTTT T AAAC C C TAG GTTTAAGTTT TATGTCTAAA 16 842 

TTACCTGAGA ACACACTAAG TCTGATAAGC TTCATTTTAT GGGCCTTTTG GATGATTATA 16902 

TAATATTCTG ATGAAAGCCA AGACAGACCC TT AAAC CAT A AAAATAGGAG TTCGAGAAAG 16962 

AGGAGTAGCA AAAGTAAAAG CTAGAATGAG ATTGAATTCT GAGTCGAAAT ACAAAATTTT 17022 
ACATATTCTG TTTCTCTCTT TTTCCCCCTC TTAG CT GAA GAT GAT G GTAAAGT 17075
Ala Glu Asp Asp Glu
-10

AGAAATGAAT TTATTTTTCT TTGCAAACTA AGTATCTGCT TGAGACACAT CTATCTCACC 1713 5 

ATTGTCAGCT GAGGAAAAAA AAAAATGGTT CTCATGCTAC CAATCTGCCT TCAAAGAAAT 17195 

GTGGACTCAG TAG C AC AG C T TTGGAATGAA GATGATCATA AGAGATACAA AGAAGAACCT 17 2 55 

CTAGCAAAAG ATGCTTCTCT ATGCCTTAAA AAATTCTCCA GCTCTTAGAA TCTACAAAAT 17315 

AGACTTTGCC TGTTTCATTG GTCCTAAGAT TAGCATGAAG CCATGGATTC TGTTGTAGGG 17 3 75 

GGAGCGTTGC ATAGGAAAAA GGGATTGAAG CATTAGAATT GTCCAAAATC AGTAACACCT 17435 

CCTCTCAGAA ATGCTTTGGG AAGAAGCCTG GAAGGTTCCG GGTTGGTGGT GGGGTGGGGC 174 95 

AGAAAATTCT GGAAGTAGAG GAGATAGGAA TGGGTGGGGG AAGAAGACCA CATTCAGAGG 17555 

CCAAAAGCTG AAAGAAAC C A TGGCATTTAT GATGAATTCA GGGTAATTCA GAATGGAAGT 17615 

AGAGTAGGAG TAGGAGACTG GTGAGAGGAG CTAGAGTGAT AAACAGGGTG TAGAGCAAGA 17 675 

CGTTCTCTCA CCCCAAGATG TGAAATTTGG ACTTTAT CTT GGAGATAATA GGGTTAATTA 17 73 5 

AGCACAATAT GTATTAGCTA GGGTAAAGAT TAGTTTGTTG TAACAAAGAC ATCCAAAGAT 17 7 95 

ACAGTAGCTG AATAAGATAG AGAATTTTTC TCTCAAAGAA AGTCTAAGTA GGCAGCTCAG 17855 

AAGTAGTATG GCTGGAAGCA ACCTGATGAT ATTGGGACCC CCAACCTTCT TCAGTCTTGT 17 915 

ACCCATCATC CCCTAGTTGT TGATCTCACT CACATAGTTG AAAAT CATC A TACTTCCTGG 17 97 5 

GTTCATATCC CAGTTATCAA GAAAGGGTCA AGAGAAGTCA GGCTCATTCC TTTCAAAGAC 18 035 

TCTAATTGGA AGTTAAACAC ATCAATCCCC CTCATATTCC ATTGACTAGA ATTTAATCAC 18095 

ATGGCCACAC CAAGTGCAAG GAAATCTGGA AAATATAATC TTTATTCCAG GTAGCCATAT 1815 5 

GACTCTTTAA AATTCAGAAA TAATATATTT TT AAAAT AT C ATTCTGGCTT TGGTATAAAG 18215 

AATTGATGGT GTGGGGTGAG GAGGCCAAAA TTAAGGGTTG AGAGCCTATT ATTTTAGTTA 18275 

TTACAAGAAA TGATGGTGTC ATGAATTAAG GTAGACATAG GGGAGTGCTG ATGAGGAGCT 18335 

GTGAATGGAT TTTAGAAACA CTTGAGAGAA TCAATAGGAC ATGATTTAGG GTTGGATTTG 183 95 

GAAAGGAGAA GAAAGTAGAA AAGATGATGC CTACATTTTT CACTTAGGCA ATTTGTACCA 18455

TT C AGTG AAA TAGGGAACAC AGGAGGAAGA GCAGGTTTTG GTGTATACAA AGAGGAGGAT 18 515 

GGATGACG C A TTTCGTTTTG GATCTGAGAT GTCTGTGGAA CGTCCTAGTG G AG AT GT C C A 18 575 

CAAACTCTTC TACATGTGGT TCTGAGTTCA GGACACAGAT TTGGGCTGGA GATAGAGATA 1863 5 

TTGTAGGCTT ATACATAGAA ATGGCATTTG AATCTATAGA GATAAAAAGA CACATCAGAG 186 95 

GAAATGTGTA AAGTGAGAGA GGAAAAGC C A AGTACTGTGC TGGGGGGAAT ACCTACATTT 187 55 

AAAGGATGCA GTAGAAAGAA GCTAATAAAC AACAGAGAGC AG ACT AAC C A AAAGGGGAGA 18 815 

AGAAAAAC C A AGAGAATTCC ACCGACTCCC AGGAGAGCAT TTCAAGATTG AGGGGATAGG 18875 

TGTTGTGTTG AATTTTGCAG CCTTGAGAAT CAAGGGCCAG AACACAGCTT T TAG AT T T AG 1893 5 

CAACAAGGAG TTTGGTGATC TCAGTGAAAG CAGCTTGATG GTGAAATGGA GGCAGAGGCA 18995 

GATTGCAATG AGTGAAACAG TGAATGGGAA GTGAAGAAAT GATACAGATA ATTCTTGCTA 19055 

AAAGCTTGGC TGTTAAAAGG AGGAGAGAAA CAAGACTAGC TGCAAAGTGA GATTGGGTTG 19115 

ATGGAGCAGT TTTAAATCTC AAAATAAAGA GCTTTGTGCT TTTTTGATTA TGAAAATAAT 1917 5 

GTGTTAATTG TAACTAATTG AGGCAATGAA AAAAGATAAT AATATGAAAG ATAAAAATAT 1923 5 

AAAAACCACC CAGAAATAAT GATAGCTACC ATTTTGATAC AATATTTCTA CACTCCTTTC 192 95 

TAT G TAT AT A TACAGACACA GAAATGCTTA TATTTTTATT AAAAGGG AT T G T AC TAT AC C 193 55 

TAAGCTGCTT TTTCTAGTTA GTGATATATA TGGACATCTC TCCATGGCAA CGAGTAATTG 19415 

CAGTTATATT AAGTTCATGA TATTTCACAA TAAGGGCATA TCTTTGCCCT TTTTATTTAA 1947 5 

TCAATTCTTA ATTGGTGAAT GTTTGTTTCC AGTTTGTTGT TGTTATTAAC AATGTTCCCA 19535 

TAAGCATTCC TGTACACCAA TGTTCACACA TTTGTCTGAT TTTTTCTTCA GGATAAAACC 195 95 

CAGGAGGTAG AATTGCTGGG TTGATAGAAG AGAAAGGATG ATTGCCAAAT TAAAGCTTCA 19655 

GTAGAGGGTA CATGCCGAGC ACAAATGGGA TCAGCCCTAG AT AC C AG AAA TGGCACTTTC 19715 

TCATTTCCCC TTGGGACAAA AGGGAGAGAG GCAATAACTG TGCTGCCAGA GTTAAATTTG 19775 

TACGTGGAGT AGCAGGAAAT CATTTGCTGA AAATGAAAAC AGAGATGATG TTGTAGAGGT 19835 

CCTGAAGAGA GCAAAGAAAA TTTGAAATTG CGGCTATCAG CTATGGAAGA GAGTGCTGAA 19 8 95 

CTGGAAAACA AAAGAAGTAT TGACAATTGG TATGCTTGTA ATGGCACCGA TTTGAACGCT 19955 

TGTGCCATTG T T C AC C AG C A GCACTCAGCA GCCAAGTTTG GAGTTTTGTA G C AG AAAG AC 2 0 015 

AAATAAGTTA GGGATTTAAT ATCCTGGCCA AATGGTAGAC AAAATGAACT CTGAGATCCA 2 0 075 

GCTGCACAGG GAAGGAAGGG AAGACGGGAA GAGGTTAGAT AGGAAATACA AGAGTCAGGA 2 013 5 

GACTGGAAGA TGTTGTGATA TTTAAGAACA CATAGAGTTG GAGTAAAAGT GTAAGAAAAC 20195 

TAGAAGGGTA AGAGACCGGT CAGAAAGTAG GCTATTTGAA GTTAACACTT CAGAGGCAGA 2 02 55 

GTAGTTCTGA ATGGTAACAA GAAATTGAGT GTGCCTTTGA GAG TAG G T T A AAAAACAATA 2 0315 

GGCAACTTTA TTGTAGCTAC TTCTGGAACA GAAGATTGTC ATTAATAGTT TT AG AAAAC T 2 03 75 

AAAATATATA GC AT ACT TAT TTGTCAATTA ACAAAGAAAC TATGTATTTT TAAATGAGAT 20435 

TTAATGTTTA TTGTAG AA AAC CTG GAA TCA GAT TAC TTT GGC AAG CTT GAA 20486
Glu Asn Leu Glu Ser Asp Tyr Phe Gly Lys Leu Glu
-5 1 5

TCT AAA TTA TCA GTC ATA AGA AAT TTG AAT GAC CAA GTT CTC TTC ATT 20534
Ser Lys Leu Ser Val Ile Arg Asn Leu Asn Asp Gln Val Leu Phe Ile
10 15 20

GAC CAA GGA AAT CGG CCT CTA TTT GAA GAT ATG ACT GAT TCT GAC TGT 20582
Asp Gln Gly Asn Arg Pro Leu Phe Glu Asp Met Thr Asp Ser Asp Cys
25 30 35

AGA G GT ATTTTTTTTA ATTCGCAAAC ATAGAAATGA CTAGCTACTT CTTCCCATTC 20638
Arg Asp
40

TGTTTTACTG CTTACATTGT TCCGTGCTAG TCCCAATCCT CAGATGAAAA GTCACAGGAG 2 06 98 

TGACAATAAT TTCACTTACA GGAAACTTTA TAAGGCATCC ACGTTTTTTA GTTGGGGTAA 20758 

AAAATTGGAT ACAATAAGAC ATTGCTAGGG GTCATGCCTC TCTGAGCCTG CCTTTGAATC 20818 

ACCAATCCCT TTATTGTGAT TGCATTAACT GTTTAAAACC TCTATAGTTG GATGCTTAAT 208 78 

CCCTGCTTGT TACAGCTGAA AATGCTGATA GTTTACCAGG TGTGGTGGCA TCTATCTGTA 2 0938 

ATCCTAGCTA CTTGGGAGGC TCAAGCAGGA GGATTGCTTG AGGCCAGGAC TTTGAGGCTG 2 0998 

TAGTACACTG TGATCGTACC TGTGAATAGC CACTGCACTC CAGCCTGGGT GATATACAGA 21058 

CCTTGTCTCT AAAATTAAAA AAAAAAAAAA AAAAAACCTT AGGAAAGGAA ATTGATCAAG 21118 

TCTACTGTGC CTTCCAAAAC ATGAATTCCA AATATCAAAG TTAGGCTGAG TTGAAGCAGT 21178 

GAATGTGCAT TCTTTAAAAA TACTGAATAC TTACCTTAAC ATATATTTTA AATATTTTAT 212 3 8 

TTAGCATTTA AAAGTTAAAA ACAATCTTTT AGAATTCATA TCTTTAAAAT ACTCAAAAAA 212 98 

GTTGCAGCGT GTGTGTTGTA ATACACATTA AACTGTGGGG TTGTTTGTTT GTTTGAGATG 21358 

CAGTTTCACT CTGTCACCCA GGCTGAAGTG CAGTGCAGTG CAGTGGTGTG ATCTCGGCTC 21418 

ACTACAACCT CCACCTCCCA CGTTCAAGCG ATTCTCATGC CTCAGTCTCC CGAGTAGGTG 214 78 

GGATTACAGG CATGCACCAC TTACACCCGG CTAATTTTTG TATTTTTAGT AGAGCTGGGG 21538 

TTTCACCATG TTGGCCAGGC TGGTCTCAAA CCCCTAACCT CAAGTGATCT GCCTGCCTCA 21598

GCCTCCCAAA CAAACAAACA ACCCCACAGT TTAATATGTG TTACAACACA CATGCTGCAA 216 5 8 

CTTTTATGAG TATTTTAATG ATATAGATTA TAAAAGGTTG TTTTTAACTT TTAAATGCTG 21718 

GGATTACAGG CATGAGCCAC TGTGCCAGGC CTGAACTGTG TTTTTAAAAA TGTCTGACCA 21778 

GCTGTACATA GTCTCCTGCA GACTGGCCAA GTCTCAAAGT GGGAACAGGT GTATTAAGGA 218 3 8 

CTATCCTTTG GTTAAATTTC CGCAAATGTT CCTGTGCAAG AATTCTTCTA ACTAGAGTTC 218 98 

TCATTTATTA TATTTATTTC AG AT AAT GCA CCC CGG ACC ATA TTT ATT ATA 21949
Asp Asn Ala Pro Arg Thr Ile Phe Ile Ile
40 45

AGT ATG TAT AAA GAT AGC CAG CCT AGA GGT ATG GCT GTA ACT ATC TCT 21997
Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile Ser
50 55 60 65

GTG AAG TGT GAG AAA ATT TCA ACT CTC TCC TGT GAG AAC AAA ATT ATT 22045
Val Lys Cys Glu Lys Ile Ser Thr Leu Ser Cys Glu Asn Lys Ile Ile
70 75 80

TCC TTT AAG GTAAGACTG AGCCTTACTT TGTTTTCAAT CATGTTAATA TAATCAATAT 22103
Ser Phe Lys

AATTAGAAAT ATAACATTAT TTCTAATGTT AATATAAGTA ATGTAATTAG AAAACTCAAA 22163 

TATCCTCAGA CCAACCTTTT GTCTAGAACA GAAATAACAA GAAGCAGAGA ACCATTAAAG 22223 

TGAATACTTA CTAAAAATTA TCAAACTCTT TACCTATTGT GATAATGATG GTTTTTCTGA 22283 

GCCTGTCACA GGGGAAGAGG AGATACAACA CTTGTTTTAT GACCTGCATC TCCTGAACAA 22343

TCAGTCTTTA TACAAATAAT AATGTAGAAT ACATATGTGA GTTATACATT TAAGAATAAC 22403 

ATGTGACTTT C CAGAATG AG TTCTGCTATG AAGAATGAAG CTAATTATCC TTCTATATTT 22463 

CTACACCTTT GTAAATTATG ATAATATTTT AATCCCTAGT TGTTTTGTTG CTGATCCTTA 22 523 

GCCTAAGTCT TAGACACAAG CTTCAGCTTC CAGTTGATGT ATGTTATTTT TAATG TTAAT 22583 

C TAAT T G AAT AAAAGTTATG AGATCAGCTG TAAAAGTAAT GCTATAATTA TCTTCAAGCC 22643 

AGGTATAAAG TATTTCTGGC CTCTACTTTT TCTCTATTAT TCTCCATTAT TATTCTCTAT 22 703 

TATTTTTCTC TATTTCCTCC ATTATTGTTA GATAAACCAC AATTAACTAT AG CT AC AG AC 22763 

TGAGCCAGTA AG AGT AGC C A GGGATGCTTA CAAATTGGCA ATGCTTCAGA GGAGAATTCC 22823 

ATGTCATGAA GACTCTTTTT GAGTGGAGAT TTGCCAATAA ATATCCGCTT TCATGCCCAC 22883 

CCAGTCCCCA CTGAAAGACA GTTAGGATAT GACCTTAGTG AAGGTACCAA GGGGCAACTT 22943 

GGTAGGGAGA AAAAAGCCAC TCTAAAATAT AATCCAAGTA AGAACAGTGC ATATGCAACA 23 003 

GATACAGCCC CCAGACAAAT CCCTCAGCTA TCTCCCTCCA ACCAGAGTGC CACCCCTTCA 23 063 

GGTGACAATT TGGAGTCCCC ATTCTAGACC TGACAGGCAG CTTAGTTATC AAAATAGCAT 2 3123 

AAGAGGCCTG GGATGGAAGG GTAGGGTGGA AAG G GT T AAG CATGCTGTTA C TG AAC AAC A 23183 

TAATTAGAAG GGAAGGAGAT GGCCAAGCTC AAGCTATGTG GGATAGAGGA AAACTCAGCT 23243 

G CAG AGG CAG ATTCAGAAAC TGGGATAAGT CCGAACCTAC AGGTGGATTC TTGTTGAGGG 23303 

AGACTGGTGA AAATGTTAAG AAGATGGAAA TAATGCTTGG CACTTAGTAG GAACTGGGCA 23363 

AATC CAT ATT TGGGGGAGCC TGAAGTTTAT TCAATTTTGA TGGCCCTTTT AAATAAAAAG 2 3423 

AATGTGGCTG GGCGTGGTGG CTCACACCTG TAATCCCAGC ACTTTGGGAG GCCGAGGGGG 23483 

GCGGATCACC TGAAGTCAGG AGTTCAAGAC CAGCCTGACC AACATGGAGA AACCCCATCT 23 543 

CTACTAAAAA TACAAAATTA GCTGGGCGTG GTGGCATATG CCTGTAATCC CAGCTACTCG 23 603 

GGAGGCTGAG GCAGGAGAAT CTTTTGAACC CGGGAGGCAG AGGTTGCGAT GAGCCTAGAT 23663 

CGTGCCATTG CACTCCAGCC TGGGCAACAA GAGCAAAACT CGGTCTCAAA AAAAAAAAAA 23 723 

AAAAAGTGAA ATTAACCAAA GGCATTAGCT TAATAATTTA ATACTGTTTT TAAGTAGGGC 23 783 

GGGGGGTGGC TGGAAGAGAT CTGTGTAAAT GAGGGAATCT GACATTTAAG CTTCATCAGC 23 843 

ATCATAGCAA ATCTGCTTCT GGAAGGAACT CAATAAATAT TAGTTGGAGG GGGGGAGAGA 23 903 

GTGAGGGGTG GACTAGGACC AGTTTTAGCC CTTGTCTTTA ATCCCTTTTC CTGCCACTAA 23 963 
TAAGGATCTT AGCAGTGGTT ATAAAAGTGG CCTAGGTTCT AGATAATAAG ATACAACAGG 24023

CCAGGCACAG TGGCTCATGC CTATAATCCC AGCACTTTGG GAGGGCAAGG CGAGTGTCTC 24083 

ACTTGAGATC AGGAGTTCAA GACCAGCCTG GCCAGCATGG CGATACTCTG TCTCTACTAA 2414 3 

AAAAAATACA AAAATTAGCC AGGCATGGTG GCATGCACCT GTAATCCCAG CTACTCGTGA 242 03 

GCCTGAGGCA GAAGAATCGC TTGAAACCAG GAGGTGTAGG CTGCAGTGAG CTGAGATCGC 24263 

ACCACTGCAC TCCAGCCTGG GCGACAGAAT GAGACTTTGT CTCAAAAAAA GAAAAAGATA 2 432 3 

CAACAGGCTA CCCTTATGTG CTCACCTTTC ACTGTTGATT ACTAGCTATA AAGTCCTATA 243 83 

AAGTTCTTTG GTCAAGAACC TTGACAACAC TAAGAGGGAT TTGCTTTGAG AGGTTACTGT 2444 3 

CAGAGTCTGT TT CAT AT AT A TACATATACA TGTATATATG TATCTATATC CAGGCTTGGC 24503 

CAGGGTTCCC TCAGACTTTC CAGTGCACTT GGGAGATGTT AGGTCAATAT CAACTTTCCC 24563 

TGGATTCAGA TTCAACCCCT TCTGATGTAA AAAAAAAAAA AAAAAAGAAA GAAATCCCTT 246 2 3 

TCCCCTTGGA GCACTCAAGT TTCACCAGGT GGGGCTTTCC AAGTTGGGGG TTCTCCAAGG 24683 

TCATTGGGAT TGCTTTCACA TCCATTTGCT ATGTACCTTC CCTATGATGG CTGGGAGTGG 24 743 

TCAACATCAA AACTAGGAAA GCTACTGCCC AAGGATGTCC TTACCTCTAT TCTGAAATGT 248 03 

GCAATAAGTG TGATTAAAGA GATTGCCTGT TCTACCTATC CACACTCTCG CTTTCAACTG 24 863 

TAACTTTCTT TTTTTCTTTT TTTCTTTTTT TCTTTTTTTT TGAAACGGAG TCTCGCTCTG 24 923 

TCGCCCAGGC TAGAGTGCAG TGGCACGATC TCAGCTCACT GCAAGCTCTG CCTCCCGGGT 24983 

TCACGCCATT CTCCTGCCTC ACCCTCCCAA GCAGCTGGGA CTACAGGCGC CTGCCACCAT 2 5043 

GCCCAGCTAA TTTTTTGTAT TTTTAGTAGA GACGGGGTTT CACCGTGTTA GCCAGGATGG 25103 

TCTCGATCTC CTGAACTTGT GATCCGCCCG CCTCAGCCTC CCAAAGTGCT GGGATTACAG 25163 

GCGTGAGCCA TCGCACCCGG CTCAACTGTA ACTTTCTATA CTGGTTCATC TTCCCCTGTA 2 5223 

ATGTTACTAG AGCTTTTGAA GTTTTGGCTA TGGATTATTT CT C ATTTATA CATTAGATTT 25283 

CAGATTAGTT CCAAATTGAT GCCCACAGCT TAGGGTCTCT TCCTAAATTG TATATTGTAG 25343 

ACAGCTGCAG AAGTGGGTGC CAATAGGGGA ACTAGTTTAT ACTTTCATCA ACTTAGGACC 254 03 

CACACTTGTT GATAAAGAAC AAAGGTCAAG AGTTATGACT ACTGATTCCA CAACTGATTG 25463 

AGAAGTTGGA GATAACCCCG TGACCTCTGC CATCCAGAGT CTTTCAGGCA TCTTTGAAGG 25 523 

ATGAAGAAAT GCTATTTTAA TTTTGGAGGT TTCTCTATCA GTGCTTAGGA TCATGGGAAT 2 5583 

CTGTGCTGCC ATGAGGCCAA AATTAAGTCC AAAACATCTA CTGGTTCCAG GATTAACATG 25643 

GAAGAACCTT AGGTGGTGCC CACATGTTCT GATCCATCCT GCAAAATAGA CATGCTGCAC 25703 

TAACAGGAAA AGTGCAGGCA GCACTACCAG TTGGATAACC TGCAAGATTA TAGTTTCAAG 2 576 3 

TAATCTAACC ATTTCTCACA AGGCCCTATT CTGTGACTGA AACATACAAG AATCTGCATT 25823 

TGGCCTTCTA AGGCAGGGCC CAGCCAAGGA GAC CAT ATT C AGGACAGAAA TTCAAGACTA 25883 

CTATGGAACT GGAGTGCTTG GCAGGGAAGA CAGAGTCAAG GACTGCCAAC TGAGCCAATA 2 5 943 

CAGCAGGCTT ACACAGGAAC CCAGGGCCTA GCCCTACAAC AATTATTGGG TC T ATT C ACT 26 003 

GTAAGTTTTA ATTTCAGGCT CCACTGAAAG AGTAAGCTAA GATTCCTGGC ACTTTCTGTC 26 063 

TCTCTCACAG TTGGCTCAGA AATGAGAACT GGTCAGGCCA GGCATGGTGG CTTACACCTG 2 612 3 

GAATCCCAGC ACTTTGGGAG GCCGAAGTGG GAGGGTCACT TGAGGCCAGG AGTTCAGGAC 2 618 3 

CAGCTTAGGC AACAAAGTGA GATACCCCCT GACCCCTTCT CTACAAAAAT AAATTTTAAA 26243 

AATTAGCCAA ATGTGGTGGT GTATACTTAC AGTCCCAGCT ACTCAGGAGG CTGAGGCAGG 2 63 03 

GGGATTGCTT GAGCCCAGGA ATTCAAGGCT GCAGTGAGCT ATGATTTCAC CACTGCACTT 26363 

CTGGCTGGGC AACAGAGCGA GACCCTGTCT CAAAGCAAAA AGAAAAAGAA ACTAGAACTA 2 6423 

GCCTAAGTTT GTGGGAGGAG GTCATCATCG TCTTTAGCCG TGAATGGTTA TTATAGAGGA 26483 

CAGAAATTGA CATTAGCCCA AAAAGCTTGT GGTCTTTGCT GGAACTCTAC TTAATCTTGA 2 6 543 

GCAAATGTGG ACACCACTCA ATGGGAGAGG AGAGAAGTAA GCTGTTTGAT GTATAGGGGA 2 6 603 

AAACTAGAGG CCTGGAACTG AATATGCATC CCATGACAGG GAGAATAGGA GATTCGGAGT 26663 

TAAGAAGGAG AGGAGGTCAG TACTGCTGTT CAGAGATTTT TTTTATGTAA CTCTTGAGAA 2 6 723 

GCAAAACTAC TTTTGTTCTG TTTGGTAATA TACTTCAAAA CAAACTTCAT AT AT T C AAAT 26 783 

TGTTCATGTC CTGAAATAAT TAGGTAATGT TTTTTTCTCT ATAG GAA ATG AAT CCT 26839
Glu Met Asn Pro
85

CCT GAT AAC ATC AAG GAT ACA AAA AGT GAC ATC ATA TTC TTT CAG AGA 26887
Pro Asp Asn Ile Lys Asp Thr Lys Ser Asp Ile Ile Phe Phe Gln Arg
90 95 100

AGT GTC CCA GGA CAT GAT AAT AAG ATG CAA TTT GAA TCT TCA TCA TAC 26935
Ser Val Pro Gly His Asp Asn Lys Met Gln Phe Glu Ser Ser Ser Tyr
105 110 115 120

GAA GGA TAC TTT CTA GCT TGT GAA AAA GAG AGA GAC CTT TTT AAA CTC 26983
Glu Gly Tyr Phe Leu Ala Cys Glu Lys Glu Arg Asp Leu Phe Lys Leu
125 130 135

ATT TTG AAA AAA GAG GAT GAA TTG GGG GAT AGA TCT ATA ATG TTC ACT 27031
Ile Leu Lys Lys Glu Asp Glu Leu Gly Asp Arg Ser Ile Met Phe Thr
140 145 150

GTT CAA AAC GAA GAC T AGCTATTAAA ATTTCATGCC GGGCGCAGTG GCTCACGCCT 27087
Val Gln Asn Glu Asp
155

GTAATCCCAG CCCTTTGGGA GGCTGAGGCG GGCAGATCAC CAGAGGTCAG GTGTTCAAGA 27147
CCAGCCTGAC CAACATGGTG AAACCTCATC TCTACTAAAA ATACAAAAAA TTAGCTGAGT 27207
GTAGTGACCC ATGCCCTCAA TCCCAGCTAC TCAAGAGGCT GAGGCAGGAG AATCACTTGC 27267

ACTCCGGAGG TGGAGGTTGT GGTGAGCCGA GATTGCACCA TTGCGCTCTA GCCTGGGCAA 27327 

C AAC AG C AAA ACTCCATCTC AAAAAATAAA ATAAATAAAT AAACAAATAA AAAATTCATA 273 8 7 

ATGTGAACTG TCTGAATTTT TATGTTTAGA AAGATTATGA GATTATTAGT CTATAATTGT 2 7447 

AATGGTGAAA TAAAATAAAT ACCAGTCTTG AAAAACATCA TTAAGAAATG AATGAACTTT 27 507 

CACAAAAGCA AACAAACAGA CTTTCCCTTA TTTAAGTGAA TAAAATAAAA TAAAATAAAA 27567 

TAATGTTTAA AAAATTCATA GTTTGAAAAC ATTCTACATT GTTAATTGGC ATATTAATTA 27627 

TACTTAATAT AATTATTTTT AAATCTTTTG GGTTATTAGT CCTAATGACA AAAGATATTG 27687 

ATATTTGAAC TTTCTAATTT TTAAGAATAT CGTTAAACCA TCAATATTTT TATAAGGAGG 27 747 

CCACTTCACT TGACAAATTT CTGAATTTCC TCCAAAGTCA GTATATTTTT AAAATTCAGT 27807 

TTGATCCTGA ATCCAGCAAT ATATAAAAGG GATTATATAC TCTGGCCAAC TG A C ATT CAT 27 8 67 

CCTAGGAATG CAAAGATGGT TTAATATCCT AAAATCAATT AACATAACAT ACTATATTAA 27927 

TAAAGTATCA AAACAGTATT CTCATCTTTT TTTCTTTTTT CACAATTCCT TGGTTACACT 27987

ATCATCTCAA TAGATGCAGA AAAAGCATTT GACAAAATCC AATTCATAAT AAAAATTCTC 28047

AAACTTGAAA GAGAACATCA TAAAGGCATC TATGAAAAAC CTACAGCTAA TATCATACTT 28107 

AACGATGAAA AACTGAATTA TTTTACCCTA AGATCAAGAA TAATGCAAGC ATGTCAGCTC 28167

TTGCAACTTC TATTCAACAT TGTACTGGAG GTTCTAGCCA GAGCAACCAT ACAATAAATA 28227 

AAAATAAAAG GCACCCAGAT TAGAAAGGAA GTCTTTATTT GCAGACAACA TGGTTCTTTA 28287

TGCAGAAAAC CGTCAGGAAT ACACACACAT GTTAGAACTA ATAAGTTCAG CAAGGTTGCA 28347

GGTTGCAATA TCAATATGCA AAAATACATT GAAGGCTGGG CTCAGTGGAG ATGGCATGTA 28407

CCTTTCGTCC CAGCTACTTG GGAGGCTGAG GTAGGAGGAT CACTTGAGGT GAGGAGTTTG 28467

AGGCTATAGT GCAATGTGAT CTTGCCTGTG AATAGCCACT GCACTCGAGC CTAGGCAACA 28527

AAGTGAGACC CCGTCTCCAA AAAAAAAAAT GGTATATTGG TATTTCTGTA TATGAACAAT 28587

GAATGATCTG AAAACAAGAA AATTCCATTC ACGATGGTAT TAAAAAAATA AAATACAAAT 28647

AAATTTAGCA AAATAATTAT AAAACTTGTA CATCGAAAAT TTCAAAGCAC TCTGAGGGAA 28707

ATTAAAGATG ATCTAAATAA TTGGAGAGAC ACTCTATGAT CACTGATTGG AAAATTCATT 28767

CAATATTGTT AAGATAACAA TTGTCCCCAA ATTGATGCAT GCATTCAATT TAGTCTTCAT 28827 

CAAAATTCCA GCAGGGTTTT TGCAGAAATT GACAAGCTGT ACCCAAAATG TATATGGAAA 28887

TGAAAAGACC CAGAAGAGCA AATAATTTTT TAAAAACAAA GTTGGAAAAC TTTTACTTCC 28947 

TAATTTTAAA ACTTACTATA AACCTAAAGT TATCAAGACC ATTTAGT 28994

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Tyr Phe Gly Lys Leu Glu Ser Lys Leu Ser
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCATCCTAAT ACGACTCACT ATAGGGC 27 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CTATAGGGCA CGCGTGGT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TTCCTCTTCC CGAAGCTGTG TAGACTGC 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GTAAGTTTTC ACCTTCCAAC TGTAGAGTCC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 



GGGATCAAGT CGTGATCAGA AGCAGCACAC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
CCTGGCTGCC AACTCTGGCT GCTAAAGCGG 



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
GTATTGTCAA TAAATTTCAT TGCCACAAAG TTG 



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24
AAGATGGCTG CTGAACCAGT AGAAGACAAT TGC 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25 
TCCTTGGTCA ATGAAGAGAA CTTGGTC 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 



(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 
GCCAGCCTAG AGGTATGGCT GTAACTATCT C 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29
GGCATGAAAT TTTAATAGCT AGTCTTCGTT TTG 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 



GTGACATCAT ATTCTTTCAG AGAAGTGTCC



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 
GCAATTTGAA TCTTCATCAT ACGAAGGATA C 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 
TCCGAAGCTT AAGATGGCTG CTGAACCAGT A 



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 
GGAAATAATT TTGTTCTCAC AGGAGAGAGT TG 



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34
ATGTAGCGGC CGCGGCATGA AATTTTAATA GCTAGTC 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CCTGGAATCA GATTACTTTG GCAAGCTTGA ATC 